Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published 9 days ago • 18
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 9 days ago • 145
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement Paper • 2511.23475 • Published 8 days ago • 41
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing Paper • 2512.00387 • Published 8 days ago • 2
REASONEDIT: Towards Reasoning-Enhanced Image Editing Models Paper • 2511.22625 • Published 9 days ago • 45
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published 24 days ago • 68
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Oct 30 • 77
A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space Paper • 2511.10555 • Published 23 days ago • 60
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 18 days ago • 222
Diversity Has Always Been There in Your Visual Autoregressive Models Paper • 2511.17074 • Published 15 days ago • 7
WorldGen: From Text to Traversable and Interactive 3D Worlds Paper • 2511.16825 • Published 16 days ago • 21
Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance Paper • 2511.07499 • Published 26 days ago • 5
Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising Paper • 2511.08633 • Published 27 days ago • 53
VideoSSR: Video Self-Supervised Reinforcement Learning Paper • 2511.06281 • Published 28 days ago • 24