Kaixin Ma

kaixinm

https://mayer123.github.io/

Mayer123

AI & ML interests

NLP, ML

Recent Activity

Organizations

None yet

upvoted a paper 2 days ago

SO-Bench: A Structural Output Evaluation of Multimodal LLMs

Paper • 2511.21750 • Published Nov 23, 2025 • 6

upvoted a paper 12 days ago

NarrativeTrack: Evaluating Video Language Models Beyond the Frame

Paper • 2601.01095 • Published 15 days ago • 6

upvoted 2 papers over 1 year ago

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Paper • 2410.10813 • Published Oct 14, 2024 • 14

LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published Oct 2, 2024 • 26

authored 2 papers over 1 year ago

LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published Oct 2, 2024 • 26

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published Sep 12, 2024 • 67

upvoted 2 papers over 1 year ago

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published Sep 12, 2024 • 67

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 133

authored a paper almost 2 years ago

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

Paper • 2401.13919 • Published Jan 25, 2024 • 32

upvoted a paper over 2 years ago

LASER: LLM Agent with State-Space Exploration for Web Navigation

Paper • 2309.08172 • Published Sep 15, 2023 • 13

authored a paper over 2 years ago