SigLino-30M

Accepted at CVPR 2026

This model stems from the CVPR 2026 AMoE paper, which designs a Mixture-of-Experts (MoE) vision architecture and trains it by distillation. We chose the name SigLino (SigLIP2 + DINOv3) for clarity, since it names both teachers.

SigLino-30M is the dense (non-MoE) variant in the SigLino model family, with 30M parameters.

Usage

import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor

model_id = "tiiuae/siglino-30M"

# trust_remote_code=True lets transformers load the custom model code from the Hub repo.
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).to("cuda", dtype=torch.bfloat16)
processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)

# Preprocess one RGB image, then match the model's device and dtype.
image = Image.open("image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt").to("cuda")
inputs["pixel_values"] = inputs["pixel_values"].to(torch.bfloat16)

# Inference only, so skip gradient tracking.
with torch.no_grad():
    outputs = model(**inputs)

# Available feature keys: 'siglino' (384-d), 'siglip2' (1152-d), 'dinov3' (1024-d)
patch_features = outputs["patch_features"]["siglino"]      # (Batch, Tokens, 384)
summary_features = outputs["summary_features"]["siglip2"]  # (Batch, 1152)
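
The summary features can be compared across images with cosine similarity, for example for retrieval or nearest-neighbour lookup. The following is a minimal sketch rather than the authors' evaluation code: `cosine_scores` is a hypothetical helper, and random tensors stand in for real model outputs so the snippet runs without downloading the checkpoint. It assumes only the `(Batch, 1152)` shape documented above.

```python
import torch
import torch.nn.functional as F


def cosine_scores(query: torch.Tensor, gallery: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between every query row and every gallery row.

    query: (Q, D), gallery: (G, D) -> scores: (Q, G)
    """
    # Cast to float32 first: the usage snippet runs the model in bfloat16,
    # which is coarse for similarity ranking.
    q = F.normalize(query.float(), dim=-1)
    g = F.normalize(gallery.float(), dim=-1)
    return q @ g.T


# Stand-in tensors (assumption): in practice these would be
# outputs["summary_features"]["siglip2"] from separate forward passes.
query = torch.randn(2, 1152, dtype=torch.bfloat16)
gallery = torch.randn(5, 1152, dtype=torch.bfloat16)

scores = cosine_scores(query, gallery)  # shape (2, 5)
best = scores.argmax(dim=-1)            # closest gallery index per query
```

Normalizing both sides before the matrix product keeps the scores in [-1, 1], so they are comparable across queries.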

Model Details

Property	Value
Architecture	Dense
Parameters	0.03B
Layers	12
Hidden Dim	384
FFN Dim	1536
Patch Size	16x16
Teachers	DINOv3, SigLIP2
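
With a 16x16 patch size, the number of patch tokens grows with the input resolution. The sketch below works through that arithmetic. It assumes one token per non-overlapping patch, no extra class or register tokens in `patch_features`, and an input size divisible by the patch size; the card does not confirm any of these. `patch_grid` is a hypothetical helper, not part of the model's API.

```python
def patch_grid(height: int, width: int, patch: int = 16) -> tuple[int, int]:
    """Return (rows, cols) of the patch grid for an image of the given size."""
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return height // patch, width // patch


rows, cols = patch_grid(512, 512)
num_tokens = rows * cols  # 32 * 32 = 1024 tokens at 512x512
```

Under the same assumptions, a 1024x1024 input gives a 64x64 grid, four times as many tokens as 512x512.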

Results (512x512, ensemble features)

Task	Metric	Score
kNN (ImageNet)	Acc	79.0
kNN (6-dataset avg)	Acc	83.3
Zero-shot cls (ImageNet)	Acc	65.1
Flickr30K I2T	R@1	82.2
MSCOCO I2T	R@1	59.7
Pascal VOC (1024)	mIoU	82.1
Cityscapes (1024)	mIoU	59.2

Citation

@article{chaybouti2025amoe,
  title={AMoE: Agglomerative Mixture-of-Experts Vision Foundation Models},
  author={Chaybouti, Sofian and Narayan, Sanath and Dahou, Yasser and Le Khac, Phuc H. and Singh, Ankit and Huynh, Ngoc Dung and Para, Wamiq Reyaz and Kuehne, Hilde and Hacid, Hakim},
  journal={arXiv preprint arXiv:2512.20157},
  year={2025}
}
