Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 4 days ago • 70
Ministral 3 Collection A collection of edge models with Base, Instruct, and Reasoning variants in three sizes (3B, 8B, and 14B), all with vision capabilities. • 9 items • Updated 4 days ago • 113
Step-Audio-R1 Collection Step-Audio-R1 is the first audio language model to successfully unlock test-time compute scaling. • 3 items • Updated 16 days ago • 15
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 5 days ago • 37
Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated 7 days ago • 26
DR Tulu Collection Models and data associated with DR Tulu, http://allenai-web/papers/drtulu • 5 items • Updated 12 days ago • 30
Lychee-Uni-MoE 2.0 Collection The second version of the omnimodal large model Uni-MoE • 7 items • Updated 13 days ago • 6
Holo2 Collection Holo2 - Cost-Efficient Models for Cross-Platform Computer-Use Agents • 3 items • Updated 23 days ago • 21
Jan-v2-VL Collection Jan-v2-VL: an 8B VLM focused on reliable, many-step task execution. • 6 items • Updated 24 days ago • 36
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion parameter reasoning models • 10 items • Updated 15 days ago • 27
Qwen3-VL Collection Qwen's new multimodal vision models in GGUF, safetensors, and dynamic Unsloth formats. • 56 items • Updated 4 days ago • 17