Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 98
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining Paper • 2504.16511 • Published Apr 23, 2025 • 22
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning Paper • 2409.12568 • Published Sep 19, 2024 • 50
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding Paper • 2403.01487 • Published Mar 3, 2024 • 16