Avatar V: Scaling Video-Reference Avatar Video Generation Paper • 2606.13872 • Published 11 days ago • 9
PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory Paper • 2606.16449 • Published 7 days ago • 5
World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible Paper • 2606.13652 • Published 11 days ago • 13
Track2View: 4D-Consistent Camera-Controlled Video Generation via Paired 3D Point Tracks Paper • 2606.15534 • Published 8 days ago • 11
Memento: Reconstruct to Remember for Consistent Long Video Generation Paper • 2606.14667 • Published 10 days ago • 16
DreamX-World 1.0: A General-Purpose Interactive World Model Paper • 2606.16993 • Published 7 days ago • 106
Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking Paper • 2606.03985 • Published 20 days ago • 41
TideGS: Scalable Training of Over One Billion 3D Gaussian Splatting Primitives via Out-of-Core Optimization Paper • 2605.20150 • Published May 19 • 7
Fast 4D Mesh Generation by Spatio-Temporal Attention Chains Paper • 2605.19786 • Published May 19 • 11
VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors Paper • 2605.11424 • Published May 12 • 4
MoCam: Unified Novel View Synthesis via Structured Denoising Dynamics Paper • 2605.12119 • Published May 12 • 2
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives Paper • 2605.12496 • Published May 12 • 29
RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments Paper • 2604.26067 • Published Apr 28 • 75
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation Paper • 2604.24763 • Published Apr 27 • 71
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published Apr 27 • 119
EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model Paper • 2604.10268 • Published Apr 11 • 12
Hierarchical Codec Diffusion for Video-to-Speech Generation Paper • 2604.15923 • Published Apr 17 • 2
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published Apr 15 • 126
ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video Paper • 2604.07882 • Published Apr 9 • 9