Trained Verifier Models
Surrogate code verifiers across three model sizes trained using multiple different algorithms as described in the Aletheia paper.

Aletheia-Bench/GRPO-Think-1.5B-16k • Text Generation • 2B • Updated Oct 30, 2025 • 25
Aletheia-Bench/GRPO-Think-7B-16k • Text Generation • 8B • Updated Oct 30, 2025 • 33
Aletheia-Bench/GRPO-Think-14B-16k • Text Generation • 15B • Updated Nov 3, 2025 • 58
Aletheia-Bench/GRPO-Think-1.5B-4k • Text Generation • 2B • Updated Dec 2, 2025 • 24

Aletheia Datasets
The datasets used in the Aletheia paper.

Aletheia-Bench/Aletheia-Train • Viewer • Updated 10 days ago • 50k • 18
Aletheia-Bench/Aletheia-DPO • Viewer • Updated 10 days ago • 50k • 16
Aletheia-Bench/Aletheia-Heldout • Viewer • Updated 10 days ago • 33.3k • 21
Aletheia-Bench/Aletheia-Strong • Viewer • Updated 10 days ago • 57.3k • 22