Glance: Accelerating Diffusion Models with 1 Sample Paper • 2512.02899 • Published 8 days ago • 25
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published 26 days ago • 44
Sailor2 Language Models Collection Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated 21 days ago • 30
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4 • 101
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback Paper • 2511.01678 • Published Nov 3 • 34
From Charts to Code: A Hierarchical Benchmark for Multimodal Models Paper • 2510.17932 • Published Oct 20 • 7
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 117
V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models Paper • 2504.06148 • Published Apr 8 • 13
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Paper • 2503.20198 • Published Mar 26 • 4