Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published 3 days ago • 184
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent Paper • 2601.07779 • Published 1 day ago • 24
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction Paper • 2601.05966 • Published 5 days ago • 21
NitroGen: An Open Foundation Model for Generalist Gaming Agents Paper • 2601.02427 • Published 10 days ago • 37
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs Paper • 2601.01046 • Published 11 days ago • 11
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published 12 days ago • 51
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation Paper • 2512.24724 • Published 14 days ago • 6
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 26 days ago • 96
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding Paper • 2512.17220 • Published 26 days ago • 111
Spatia: Video Generation with Updatable Spatial Memory Paper • 2512.15716 • Published 28 days ago • 31
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations Paper • 2512.21004 • Published 21 days ago • 12
StoryMem: Multi-shot Long Video Storytelling with Memory Paper • 2512.19539 • Published 23 days ago • 17
Region-Constraint In-Context Generation for Instructional Video Editing Paper • 2512.17650 • Published 26 days ago • 50
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning Paper • 2512.15635 • Published 28 days ago • 19
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published 28 days ago • 60