AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems Paper • 2601.11354 • Published 21 days ago • 4
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following Paper • 2601.06431 • Published 28 days ago • 12
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought Paper • 2505.15431 • Published May 21, 2025 • 1
Pre-Trained Policy Discriminators are General Reward Models Paper • 2507.05197 • Published Jul 7, 2025 • 39
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles Paper • 2505.19914 • Published May 26, 2025 • 46
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation Paper • 2402.05733 • Published Feb 8, 2024
SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals Paper • 2406.04784 • Published Jun 7, 2024 • 2
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18, 2025 • 144
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning Paper • 2504.13914 • Published Apr 10, 2025 • 4
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles Paper • 2505.19914 • Published May 26, 2025 • 46
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models Paper • 2505.07591 • Published May 12, 2025 • 11
BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation Paper • 2504.14538 • Published Apr 20, 2025 • 30
TransMamba: Flexibly Switching between Transformer and Mamba Paper • 2503.24067 • Published Mar 31, 2025 • 21
Implicit Reasoning in Transformers is Reasoning through Shortcuts Paper • 2503.07604 • Published Mar 10, 2025 • 23
Implicit Reasoning in Transformers is Reasoning through Shortcuts Paper • 2503.07604 • Published Mar 10, 2025 • 23
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles Paper • 2502.09082 • Published Feb 13, 2025 • 30
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles Paper • 2502.09082 • Published Feb 13, 2025 • 30