---
title: README
emoji: 🔥
colorFrom: blue
colorTo: red
sdk: static
pinned: false
---
**Orsta** is a family of high-performance Vision-Language Models (7B–32B) trained on eight diverse tasks, including detection, grounding, math, and visual puzzles. The models are optimized for both **visual perception** and **reasoning**. They are trained with [V-Triune](https://github.com/MiniMax-AI/One-RL-to-See-Them-All), a unified RL framework that streamlines multi-task learning across vision-language domains. Across its 7B and 32B variants, Orsta achieves improvements ranging from **+2.1** to **+14.1**, and the gains extend to a wide range of downstream tasks.

Explore our models, tasks, and results in the [technical report](https://github.com/MiniMax-AI/One-RL-to-See-Them-All/blob/main/MiniMax-One-RL-to-See-Them-All-v250523.pdf).