Running on CPU Upgrade Featured 2.54k The Smol Training Playbook 📚 2.54k The secrets to building world-class LLMs
Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning Paper • 2509.22601 • Published Sep 26 • 29
Running on CPU Upgrade 13.7k Open LLM Leaderboard 🏆 13.7k Track, rank and evaluate open LLMs and chatbots
Post-training-Data-Flywheel/AutoIF-instruct-61k-with-funcs Viewer • Updated Oct 3, 2024 • 61.5k • 265 • 6
TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023 • 40