Chenming Zhu

ChaimZhu

https://zcmax.github.io/

AI & ML interests

Multimodal Large Language Models, 3D Perception and Understanding, Embodied AI

Recent Activity

upvoted a paper 13 days ago

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Paper • 2512.10863 • Published 20 days ago • 21

upvoted a paper 20 days ago

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation

Paper • 2512.08186 • Published 22 days ago • 21

upvoted a paper about 1 month ago

G^2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

Paper • 2511.21688 • Published Nov 26 • 8

updated a model about 1 month ago

InternRobotics/InternVLA-N1-wo-dagger

Robotics • 8B • Updated Nov 25 • 627 • 38

upvoted a paper 3 months ago

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

Paper • 2509.24695 • Published Sep 29 • 44

updated a model 4 months ago

InternRobotics/InternVLA-N1-Preview

Robotics • 8B • Updated Sep 1 • 2 • 6

published a model 4 months ago

InternRobotics/InternVLA-N1-wo-dagger

Robotics • 8B • Updated Nov 25 • 627 • 38

upvoted a paper 4 months ago

T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation

Paper • 2508.17472 • Published Aug 24 • 26

liked a dataset 4 months ago

jasonzhango/SPAR-7M-RGBD

Updated Jun 15 • 447 • 7

updated a model 5 months ago

InternRobotics/InternVLA-N1-System2-wo-dagger

8B • Updated Jul 28 • 71 • 1

published a model 5 months ago

InternRobotics/InternVLA-N1-System2-wo-dagger

8B • Updated Jul 28 • 71 • 1

liked a model 5 months ago

moonshotai/Kimi-K2-Instruct

Text Generation • 1T • Updated Nov 7 • 65.9k • 2.29k

authored a paper 6 months ago

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Paper • 2507.07984 • Published Jul 10 • 42

updated a dataset 6 months ago

ChaimZhu/LLaVA-3D-Data

Viewer • Updated Jul 11 • 859k • 81

published a dataset 6 months ago

ChaimZhu/LLaVA-3D-Data

Viewer • Updated Jul 11 • 859k • 81

upvoted a paper 6 months ago

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Paper • 2507.07984 • Published Jul 10 • 42

commented on a paper 6 months ago

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Paper • 2507.07984 • Published Jul 10 • 42

liked a dataset 6 months ago

rbler/OST-Bench

Updated Nov 28 • 203 • 4

upvoted 2 papers 6 months ago

StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling

Paper • 2507.05240 • Published Jul 7 • 47

OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

Paper • 2507.06165 • Published Jul 8 • 58
