Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs Paper • 2503.05139 • Published Mar 7, 2025 • 5
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published 27 days ago • 78
FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling Paper • 2510.24645 • Published Oct 28, 2025 • 8
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows Paper • 2512.05150 • Published Dec 3, 2025 • 74
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 244
AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning Paper • 2511.19304 • Published Nov 24, 2025 • 90
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published Nov 10, 2025 • 55
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation Paper • 2511.05516 • Published Oct 26, 2025 • 10
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14, 2025 • 85
Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation Paper • 2510.22115 • Published Oct 25, 2025 • 83
Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation Paper • 2510.24821 • Published Oct 28, 2025 • 38
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22, 2025 • 114
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model Paper • 2510.18855 • Published Oct 21, 2025 • 71
ACEBench: Who Wins the Match Point in Tool Usage? Paper • 2501.12851 • Published Jan 22, 2025 • 3
Article Ring-flash-linear-2.0: A Highly Efficient Hybrid Architecture for Test-Time Scaling Oct 9, 2025 • 11