Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation • Paper • 2512.03534 • Published 5 days ago • 18
PixelDiT: Pixel Diffusion Transformers for Image Generation • Paper • 2511.20645 • Published 12 days ago • 25
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework • Paper • 2512.03041 • Published 5 days ago • 57
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration • Paper • 2511.21689 • Published 11 days ago • 96
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models • Paper • 2512.02556 • Published 6 days ago • 172
Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout • Paper • 2511.20649 • Published 12 days ago • 43
First Frame Is the Place to Go for Video Content Customization • Paper • 2511.15700 • Published 18 days ago • 52
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space • Paper • 2511.10555 • Published 24 days ago • 60
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation • Paper • 2511.09611 • Published 25 days ago • 68
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models • Paper • 2511.10629 • Published 24 days ago • 122
Depth Anything 3: Recovering the Visual Space from Any Views • Paper • 2511.10647 • Published 24 days ago • 92
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising • Paper • 2511.08633 • Published 28 days ago • 53
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B • Paper • 2511.06221 • Published 29 days ago • 128