HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning (Qwen), submitted by shenzhi-wang
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models (9 authors), submitted by Franklinzhang
LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation (DAMO Academy), submitted by JacobYuan
BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection (Tsinghua University), submitted by JusperLee
WorldAgents: Can Foundation Image Models be Agents for 3D World Models? (3 authors), submitted by taesiri
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD (Deepmind), submitted by taesiri
EgoForge: Goal-Directed Egocentric World Simulator (Perception and LANguage Lab), submitted by isminoula
FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow (9 authors), submitted by yangzhifei
AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science (15 authors), submitted by lainmn
Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality (2 authors), submitted by bingo123122121