# Jina Embeddings V4 (Nova Edition)
This fork of jinaai/jina-embeddings-v4 is tailored for Nova deployments. It bundles the original base checkpoint, updated projector-only adapters, and a Nova-oriented chat template so you can drop the model into a worker without extra patching.
## Contents
- `model.safetensors`, config, tokenizer, and processor files copied from the upstream release.
- `chat_template.json`: identical to Jina/Qwen2.5-VL and required for proper prompt formatting.
- `adapters/retrieval/adapter_model.safetensors`
- `adapters/text-matching/adapter_model.safetensors`
- `adapters/code/adapter_model.safetensors`
- Each adapter ships with a corrected `adapter_config.json` and targets only the multi-vector projector (rank 32); see the layout sketch below.
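For orientation, the repository layout looks roughly like this (a sketch based on the list above; the auxiliary config/tokenizer/processor file names are not spelled out here):

```
remodlai/jina-embeddings-v4-nova
├── model.safetensors        # base checkpoint (plus config, tokenizer, processor files)
├── chat_template.json       # Jina/Qwen2.5-VL chat template
└── adapters/
    ├── retrieval/
    │   ├── adapter_config.json
    │   └── adapter_model.safetensors
    ├── text-matching/
    │   ├── adapter_config.json
    │   └── adapter_model.safetensors
    └── code/
        ├── adapter_config.json
        └── adapter_model.safetensors
```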
## Launching Nova
```bash
nova serve remodlai/jina-embeddings-v4-nova \
  --trust-remote-code \
  --is-multi-vector-embeddings \
  --enable-lora \
  --max-lora-rank 32 \
  --max-loras 3 \
  --chat-template /workspace/models/jina/chat_template.json \
  --load-lora retrieval=/workspace/models/jina/adapters/retrieval/adapter_model.safetensors
```
- Load the `text-matching` and `code` adapters with additional `--load-lora` flags (see the sketch after this list) or via the `/v1/internal/lora/load` endpoint.
- Keep `--max-lora-rank` aligned with the adapter rank (32) to avoid Punica warm-up failures.
- The worker automatically flattens multimodal prompts into the expected `<|vision_start|>…` placeholder string before applying the chat template.
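For example, a launch command that registers all three adapters at start-up could look like the following. This is a sketch, assuming repeated `--load-lora name=path` flags and the same `/workspace/models/jina/adapters/` layout used above:

```bash
# Serve the model with retrieval, text-matching, and code adapters preloaded.
nova serve remodlai/jina-embeddings-v4-nova \
  --trust-remote-code \
  --is-multi-vector-embeddings \
  --enable-lora \
  --max-lora-rank 32 \
  --max-loras 3 \
  --chat-template /workspace/models/jina/chat_template.json \
  --load-lora retrieval=/workspace/models/jina/adapters/retrieval/adapter_model.safetensors \
  --load-lora text-matching=/workspace/models/jina/adapters/text-matching/adapter_model.safetensors \
  --load-lora code=/workspace/models/jina/adapters/code/adapter_model.safetensors
```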
## Request Guidelines
- Include `task` per item (`"retrieval"`, `"text-matching"`, `"code"`). Optional `adapter` overrides are supported when multiple adapters are active.
- For images, supply `image` (URL, bytes, base64, or list) or `image_embeds`; the worker converts them to `PIL.Image.Image` objects for the processor.
- Single-vector callers can set `"return_multivector": false` and optionally `"dimensions": 512` (etc.) for matryoshka truncation.
Example (single-vector response with instructions):
```bash
curl -X POST http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "remodlai/jina-embeddings-v4-nova",
    "encoding_format": "float",
    "return_multivector": false,
    "dimensions": 512,
    "instructions": "Focus on cheese-specific details when comparing passages.",
    "input": [
      {
        "task": "retrieval",
        "adapter": "retrieval",
        "text": "Describe the trend shown in this chart",
        "image": "https://example.org/chart.png"
      },
      {
        "task": "text-matching",
        "adapter": "text-matching",
        "text": "A beautiful sunset over the beach"
      }
    ]
  }'
```
Change "encoding_format": "base64" if you prefer base64-encoded vectors. Omit "return_multivector": false to receive the default multi-vector output (one 128-d vector per chunk).
## Notes for Fork Consumers
- This model remains subject to the Qwen Research License (inherited from `Qwen2.5-VL-3B-Instruct`).
- Upstream documentation (training details, benchmarks, etc.) can be found in the original repository: https://huggingface.co/jinaai/jina-embeddings-v4
- If you publish updates, please note that the adapters are projector-only and expect the Jina V4 chat template to be applied before tokenization.