lmms-lab-encoder/LLaVA-OneVision-2-8B-Instruct Image-Text-to-Text • 9B • Updated 25 days ago • 17.7k • 13
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published May 27 • 75
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning Paper • 2605.20342 • Published May 19 • 34
ov2-1/date0511-LLaVA-OneVision-2-4B-p16m33-mcore-tp1-pp1-stage1-alignment-adapter-only Updated May 18
FileGram: Grounding Agent Personalization in File-System Behavioral Traces Paper • 2604.04901 • Published Apr 6 • 40
NEO-unify: Building Native Multimodal Unified Models End to End Article • sensenova • Mar 5 • 167