Nemotron Speech Collection Open, state-of-the-art, production-ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization and S2S • 17 items • Updated 5 days ago • 29
LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR Paper • 2601.14251 • Published 5 days ago • 21
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning Paper • 2601.09708 • Published 11 days ago • 50
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 17 days ago • 206
PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution Paper • 2601.10657 • Published 10 days ago • 19
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking Paper • 2601.06487 • Published 15 days ago • 50
Can We Predict Before Executing Machine Learning Agents? Paper • 2601.05930 • Published 16 days ago • 26