Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 2 days ago • 100
MARS: Modular Agent with Reflective Search for Automated AI Research Paper • 2602.02660 • Published Feb 2 • 66
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning Paper • 2602.01058 • Published Feb 1 • 43
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 222
T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation Paper • 2512.21094 • Published Dec 24, 2025 • 25
CoDA: Agentic Systems for Collaborative Data Visualization Paper • 2510.03194 • Published Oct 3, 2025 • 30
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining Paper • 2505.07608 • Published May 12, 2025 • 82
A Comprehensive Survey on Long Context Language Modeling Paper • 2503.17407 • Published Mar 20, 2025 • 49
MPO: Boosting LLM Agents with Meta Plan Optimization Paper • 2503.02682 • Published Mar 4, 2025 • 29
Accelerating LLM Inference: Fast Sampling with Gumbel-Max Trick Article • Oct 24, 2024 • 14
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 76
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published Oct 17, 2024 • 30
LongEmbed: Extending Embedding Models for Long Context Retrieval Paper • 2404.12096 • Published Apr 18, 2024 • 3
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 26