Instruction-Following Evaluation for Large Language Models • Paper • 2311.07911 • Published Nov 14, 2023 • 22
vectara/hallucination_evaluation_model • Text Classification • 0.1B • Updated Oct 20, 2025 • 138k • 354
A Survey on Evaluation of Large Language Models • Paper • 2307.03109 • Published Jul 6, 2023 • 43
Open Medical-LLM Leaderboard 🥇 • 436 • Explore and submit models for benchmarking
La Leaderboard 🌸 • 76 • Evaluate open LLMs in the languages of LATAM and Spain.