PERK: Long-Context Reasoning as Parameter-Efficient Test-Time Learning Paper • 2507.06415 • Published Jul 8, 2025 • 7
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs Paper • 2507.05687 • Published Jul 8, 2025 • 31
4KAgent: Agentic Any Image to 4K Super-Resolution Paper • 2507.07105 • Published Jul 9, 2025 • 107
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality Paper • 2507.07202 • Published Jul 9, 2025 • 25
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS Paper • 2507.07136 • Published Jul 9, 2025 • 40
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding Paper • 2507.07984 • Published Jul 10, 2025 • 43
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs Paper • 2507.07990 • Published Jul 10, 2025 • 46
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy Paper • 2503.24388 • Published Mar 31, 2025 • 29
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published Mar 25, 2025 • 41
Efficient Inference for Large Reasoning Models: A Survey Paper • 2503.23077 • Published Mar 29, 2025 • 46
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published Mar 31, 2025 • 62
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published Mar 30, 2025 • 94
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published Mar 30, 2025 • 141
Scaling Language-Free Visual Representation Learning Paper • 2504.01017 • Published Apr 1, 2025 • 33
Command A: An Enterprise-Ready Large Language Model Paper • 2504.00698 • Published Apr 1, 2025 • 29