- Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation
  Paper • 2508.07901 • Published • 40
- CharacterShot: Controllable and Consistent 4D Character Animation
  Paper • 2508.07409 • Published • 39
- Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
  Paper • 2508.17437 • Published • 37
Collections
Collections including paper arxiv:2508.17437
- MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
  Paper • 2508.14879 • Published • 68
- VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
  Paper • 2508.19247 • Published • 42
- Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
  Paper • 2508.17437 • Published • 37
- Multi-View 3D Point Tracking
  Paper • 2508.21060 • Published • 23

- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 29
- Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
  Paper • 2508.17437 • Published • 37
- Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
  Paper • 2509.00428 • Published • 17
- F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
  Paper • 2509.06951 • Published • 31

- 4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
  Paper • 2503.10437 • Published • 33
- Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
  Paper • 2503.09642 • Published • 19
- VGGT: Visual Geometry Grounded Transformer
  Paper • 2503.11651 • Published • 34
- 1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
  Paper • 2503.16422 • Published • 14

- Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
  Paper • 2508.07981 • Published • 58
- CharacterShot: Controllable and Consistent 4D Character Animation
  Paper • 2508.07409 • Published • 39
- ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
  Paper • 2508.10881 • Published • 52
- Puppeteer: Rig and Animate Your 3D Models
  Paper • 2508.10898 • Published • 33

- RAISECity: A Multimodal Agent Framework for Reality-Aligned 3D World Generation at City-Scale
  Paper • 2511.18005 • Published • 1
- SynCity: Training-Free Generation of 3D Worlds
  Paper • 2503.16420 • Published • 27
- CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
  Paper • 2501.08983 • Published • 22
- WorldGrow: Generating Infinite 3D World
  Paper • 2510.21682 • Published • 42

- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
  Paper • 2401.09416 • Published • 11
- SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
  Paper • 2401.10171 • Published • 14
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
  Paper • 2311.09217 • Published • 22
- GALA: Generating Animatable Layered Assets from a Single Scan
  Paper • 2401.12979 • Published • 9