---
title: README
emoji: 🔥
colorFrom: blue
colorTo: red
sdk: static
pinned: false
---
![main-figure](main-figure.png)
Orsta is a family of high-performance Vision-Language Models (7B to 32B parameters) trained on eight diverse tasks, including detection, grounding, math, and visual puzzles, and optimized for both visual perception and reasoning. The models are trained with [V-Triune](https://github.com/MiniMax-AI/One-RL-to-See-Them-All), a unified RL system that streamlines multi-task learning across vision-language domains. Improvements range from +2.1 to +14.1 on MEGA-Bench Core across the 7B and 32B variants, with benefits extending to a wide range of downstream tasks.

Explore our models, tasks, and results in the [technical report](https://github.com/MiniMax-AI/One-RL-to-See-Them-All/blob/main/MiniMax-One-RL-to-See-Them-All-v250523.pdf).