ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published 11 days ago • 96
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published 14 days ago • 29
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 16 items • Updated 4 days ago • 26
Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11 • 75
RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models Paper • 2412.07679 • Published Dec 10, 2024
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge Paper • 2411.12915 • Published Nov 19, 2024
Minifinetuning: Low-Data Generation Domain Adaptation through Corrective Self-Distillation Paper • 2506.15702 • Published May 30