HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption Paper • 2310.01779 • Published Oct 3, 2023 • 4
CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models Paper • 2311.11567 • Published Nov 20, 2023 • 8
ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback Paper • 2503.19988 • Published Mar 25, 2025
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published 11 days ago • 25
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding Paper • 2403.01487 • Published Mar 3, 2024 • 16