LATTICE: Democratize High-Fidelity 3D Generation at Scale Paper • 2512.03052 • Published 14 days ago • 7
SIMA 2: A Generalist Embodied Agent for Virtual Worlds Paper • 2512.04797 • Published 3 days ago • 12
NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation Paper • 2512.05106 • Published 3 days ago • 11
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 4 days ago • 139
Video Generation Models Are Good Latent Reward Models Paper • 2511.21541 • Published 11 days ago • 45
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published 18 days ago • 91
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks Paper • 2511.15065 • Published 19 days ago • 74
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space Paper • 2511.10555 • Published 24 days ago • 60
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model Paper • 2511.13647 • Published 20 days ago • 70
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published 26 days ago • 194
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7 • 53
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning Paper • 2509.22647 • Published Sep 26 • 32
Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets Paper • 2509.21245 • Published Sep 25 • 38
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe Paper • 2509.18154 • Published Sep 16 • 51
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation Paper • 2509.19296 • Published Sep 23 • 23
VideoFrom3D: 3D Scene Video Generation via Complementary Image and Video Diffusion Models Paper • 2509.17985 • Published Sep 22 • 26