ArtHOI: Taming Foundation Models for Monocular 4D Reconstruction of Hand-Articulated-Object Interactions Paper • 2603.25791 • Published 6 days ago • 3
PoseDreamer: Scalable and Photorealistic Human Data Generation Pipeline with Diffusion Models Paper • 2603.28763 • Published 2 days ago • 4
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation Paper • 2603.29029 • Published 2 days ago • 5
Learn2Fold: Structured Origami Generation with World Model Planning Paper • 2603.29585 • Published Feb 2 • 5
VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing Paper • 2603.29852 • Published Feb 22 • 4
OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training Paper • 2603.28858 • Published 2 days ago • 5
FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration Paper • 2603.29557 • Published 1 day ago • 11
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization Paper • 2603.29664 • Published 1 day ago • 28
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 1 day ago • 33
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward Paper • 2603.26599 • Published 5 days ago • 43
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published 2 days ago • 58
Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells Paper • 2603.25240 • Published 6 days ago • 70
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 3 days ago • 112
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 12 days ago • 285
EpochX: Building the Infrastructure for an Emergent Agent Civilization Paper • 2603.27304 • Published 4 days ago • 41