Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 7 days ago • 78
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published 14 days ago • 240
Article Transformers v5: Simple model definitions powering the AI ecosystem • 7 days ago • 224
INTELLECT-3 Collection INTELLECT-3: A 100B+ MoE trained with large-scale RL • 4 items • Updated 9 days ago • 11
Article Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models • 19 days ago • 26
NeMo Gym Collection Collection of RL verifiable data for NeMo Gym • 8 items • Updated 4 days ago • 8
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 390
gpt-oss-safeguard Collection gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built upon gpt-oss • 2 items • Updated Oct 29 • 58
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv • Oct 23 • 134
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6 • 493
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models Paper • 2504.10449 • Published Apr 14 • 15
Tiny Language Model Datasets Collection Collection of synthetic datasets that can be used in pretraining of any Tiny Language Model • 14 items • Updated Sep 21 • 29
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Paper • 2508.18672 • Published Aug 26 • 10
Fantastic Pretraining Optimizers and Where to Find Them Paper • 2509.02046 • Published Sep 2 • 13
AWorld: Orchestrating the Training Recipe for Agentic AI Paper • 2508.20404 • Published Aug 28 • 38