Jina Embeddings V4 (Nova Edition)

This fork of jinaai/jina-embeddings-v4 is tailored for Nova deployments. It bundles the original base checkpoint, updated projector-only adapters, and a Nova-oriented chat template so you can drop the model into a worker without extra patching.

Contents

  • model.safetensors, config, tokenizer, and processor files copied from the upstream release.
  • chat_template.json: identical to Jina/Qwen2.5-VL and required for proper prompt formatting.
  • adapters/
    • retrieval/adapter_model.safetensors
    • text-matching/adapter_model.safetensors
    • code/adapter_model.safetensors
    • Each adapter ships with a corrected adapter_config.json and targets only the multi-vector projector (rank 32).
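
Before launching, it can be useful to confirm that each adapter's rank fits the serving limit. The following is a minimal sketch that assumes adapter_config.json follows the usual PEFT layout, where the LoRA rank is stored under the "r" key. The helper name check_adapter_rank is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def check_adapter_rank(config_path, max_lora_rank=32):
    """Return the LoRA rank found in a PEFT-style adapter_config.json.

    Raises ValueError when the rank is larger than max_lora_rank.
    Assumption: the file stores the rank under the "r" key, as PEFT does.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    rank = config["r"]
    if rank > max_lora_rank:
        raise ValueError(
            f"adapter rank {rank} exceeds --max-lora-rank {max_lora_rank}"
        )
    return rank
```

Calling check_adapter_rank("adapters/retrieval/adapter_config.json") should return 32 for the adapters described above, if the layout assumption holds.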

Launching Nova

nova serve remodlai/jina-embeddings-v4-nova \
  --trust-remote-code \
  --is-multi-vector-embeddings \
  --enable-lora \
  --max-lora-rank 32 \
  --max-loras 3 \
  --chat-template /workspace/models/jina/chat_template.json \
  --load-lora retrieval=/workspace/models/jina/adapters/retrieval/adapter_model.safetensors
  • Load text-matching and code adapters with additional --load-lora flags or the /v1/internal/lora/load endpoint.
  • Keep --max-lora-rank aligned with the adapter rank (32) to avoid Punica warm-up failures.
  • The worker automatically flattens multimodal prompts into the expected <|vision_start|>… placeholder string before applying the chat template.
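
To load all three adapters at startup, the command above can be assembled with one --load-lora flag per adapter. This is a sketch only: it reuses the flags shown above and assumes the text-matching and code adapters live under the same /workspace/models/jina/adapters/ layout as the retrieval adapter. The helper name build_serve_command is hypothetical.

```python
import shlex

# Assumption: every adapter sits under this directory, mirroring the
# retrieval path used in the launch command.
MODEL_DIR = "/workspace/models/jina"


def build_serve_command(adapters=("retrieval", "text-matching", "code")):
    """Return the `nova serve` argument list with one --load-lora per adapter."""
    args = [
        "nova", "serve", "remodlai/jina-embeddings-v4-nova",
        "--trust-remote-code",
        "--is-multi-vector-embeddings",
        "--enable-lora",
        "--max-lora-rank", "32",  # must match the adapter rank
        "--max-loras", "3",
        "--chat-template", f"{MODEL_DIR}/chat_template.json",
    ]
    for name in adapters:
        path = f"{MODEL_DIR}/adapters/{name}/adapter_model.safetensors"
        args += ["--load-lora", f"{name}={path}"]
    return args


if __name__ == "__main__":
    # Print a shell-ready command line for copy and paste.
    print(shlex.join(build_serve_command()))
```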

Request Guidelines

  • Set "task" on every input item ("retrieval", "text-matching", or "code"). An optional "adapter" field overrides the adapter choice when multiple adapters are active.
  • For images, supply image (URL, bytes, base64, or list) or image_embeds; the worker converts them to PIL.Image.Image objects for the processor.
  • Single-vector callers can set "return_multivector": false and, optionally, "dimensions" (for example 512) for matryoshka truncation.
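
For readers unfamiliar with matryoshka truncation, the sketch below shows what it typically amounts to on the client side: keep the leading components of the vector and rescale to unit length. Whether the Nova worker renormalizes after truncating is not stated here, so treat the rescaling step as an assumption based on common practice. The function name truncate_embedding is hypothetical.

```python
import math


def truncate_embedding(vector, dimensions):
    """Keep the first `dimensions` components and L2-normalize the result."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = list(vector[:dimensions])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]
```

This is only needed if you request full-size vectors and want to shorten them yourself; setting "dimensions" in the request delegates the truncation to the server.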

Example (single-vector response with instructions):

curl -X POST http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
        "model": "remodlai/jina-embeddings-v4-nova",
        "encoding_format": "float",
        "return_multivector": false,
        "dimensions": 512,
        "instructions": "Focus on cheese-specific details when comparing passages.",
        "input": [
          {
            "task": "retrieval",
            "adapter": "retrieval",
            "text": "Describe the trend shown in this chart",
            "image": "https://example.org/chart.png"
          },
          {
            "task": "text-matching",
            "adapter": "text-matching",
            "text": "A beautiful sunset over the beach"
          }
        ]
      }'
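
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl payload above; the helper names build_payload and post_embeddings are hypothetical, and the response is returned as parsed JSON without assuming anything about its fields.

```python
import json
import urllib.request


def build_payload(items, instructions=None, dimensions=512):
    """Build a single-vector embeddings request body for the given items."""
    payload = {
        "model": "remodlai/jina-embeddings-v4-nova",
        "encoding_format": "float",
        "return_multivector": False,
        "dimensions": dimensions,
        "input": items,
    }
    if instructions:
        payload["instructions"] = instructions
    return payload


def post_embeddings(payload, url="http://localhost:8000/v1/embeddings"):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    items = [
        {"task": "text-matching", "adapter": "text-matching",
         "text": "A beautiful sunset over the beach"},
    ]
    print(post_embeddings(build_payload(items)))
```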

Set "encoding_format" to "base64" if you prefer base64-encoded vectors. Omit "return_multivector": false to receive the default multi-vector output (one 128-d vector per chunk).
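
If you request base64 output, each vector has to be decoded on the client. The sketch below assumes the payload is a base64 string of packed little-endian float32 values, which is what OpenAI-compatible servers commonly emit; the byte layout used by Nova is not documented here, so verify it against a float response before relying on this. The function name decode_base64_embedding is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats.

    Assumption: the bytes are packed little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4 (float32)")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))
```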

Notes for Fork Consumers

  • This model remains subject to the Qwen Research License (inherited from Qwen2.5-VL-3B-Instruct).
  • Upstream documentation (training details, benchmarks, etc.) can be found in the original repository: https://huggingface.co/jinaai/jina-embeddings-v4
  • If you publish updates, please note that adapters are projector-only and expect the Jina V4 chat template to be applied before tokenization.