Open LLM Leaderboard for domains (Paused, Agents, 30): ranking of open-source LLMs in different domains.
MMLU-Pro Leaderboard (Running on CPU Upgrade, Agents, 245): a more advanced and challenging multi-task evaluation.