Bohan Zhai PRO

Borise

AI & ML interests

LLM, Audio, NLP, 3D vision, vision language

Recent Activity

liked a dataset 4 days ago

Borise/CaptionQA

commented on a paper 6 days ago

CaptionQA: Is Your Caption as Useful as the Image Itself?

upvoted an article 6 days ago

📌 Rethinking Multimodality from an Industry Perspective: Captioning Is Far More Important Than You Think

authored 5 papers 8 days ago

HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption

Paper • 2310.01779 • Published Oct 3, 2023 • 4

CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Paper • 2311.11567 • Published Nov 20, 2023 • 8

Multitask Vision-Language Prompt Tuning

Paper • 2211.11720 • Published Nov 21, 2022 • 2

ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback

Paper • 2503.19988 • Published Mar 25

CaptionQA: Is Your Caption as Useful as the Image Itself?

Paper • 2511.21025 • Published 11 days ago • 25

authored a paper over 1 year ago

Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published Aug 29, 2024 • 95

authored a paper almost 2 years ago

InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding

Paper • 2403.01487 • Published Mar 3, 2024 • 16