EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published 4 days ago • 33
PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design Paper • 2512.04082 • Published 6 days ago • 12
Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training Paper • 2308.06689 • Published Aug 13, 2023
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input Paper • 2411.11934 • Published Nov 18, 2024
OneThinker: All-in-one Reasoning Model for Image and Video Paper • 2512.03043 • Published 7 days ago • 29
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models Paper • 2506.21356 • Published Jun 26 • 22
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark Paper • 2510.13759 • Published Oct 15 • 9
VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness Paper • 2503.21755 • Published Mar 27 • 33
Architecture Decoupling Is Not All You Need For Unified Multimodal Model Paper • 2511.22663 • Published 12 days ago • 28
ExpVid: A Benchmark for Experiment Video Understanding & Reasoning Paper • 2510.11606 • Published Oct 13 • 3
VEU-Bench: Towards Comprehensive Understanding of Video Editing Paper • 2504.17828 • Published Apr 24
RynnVLA-002: A Unified Vision-Language-Action and World Model Paper • 2511.17502 • Published 18 days ago • 24
Simulating the Visual World with Artificial Intelligence: A Roadmap Paper • 2511.08585 • Published 28 days ago • 29