Esper 3.1 Collection Esper 3.1 is a DevOps, architecture, code, and general reasoning finetune for Qwen, Ministral and gpt-oss! • 5 items • Updated 3 days ago • 1
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 722
CodeContests+: High-Quality Test Case Generation for Competitive Programming Paper • 2506.05817 • Published Jun 6 • 9
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published Jun 25 • 47
🐙 OctoThinker Collection Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated Jun 26 • 2
Changelog Organization and User profiles now include repository listing pages Jun 20 • 131
Esper 3 Collection Esper 3 is a DevOps, architecture, code, and general reasoning finetune for Qwen 3! • 4 items • Updated 3 days ago • 3
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published Jan 2 • 52
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 3 days ago • 155
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2, 2024 • 24
Llama 3.x Models Collection Our models built with Llama 3, 3.1, and 3.2 • 10 items • Updated 3 days ago • 3
Llamafied Yi Collection Yi base models converted to Llama architecture. • 4 items • Updated Nov 14, 2023 • 9
Open LLM Leaderboard best models ❤️‍🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard • 65 items • Updated Mar 20 • 652