Step-Controlled DPO - a MathGenie Collection

Updated Jul 5, 2024

Models and datasets for Step-Controlled DPO.

MathGenie/InternLM2-SFT-SCDPO
Text Generation • 20B • Updated Jun 29, 2024 • 14 downloads • 1 like

MathGenie/Mistral-7B-Ours-SFT
Text Generation • 7B • Updated Jul 3, 2024 • 6 downloads • 1 like

MathGenie/Mistral-7B-Ours-SFT-SCDPO
Text Generation • 7B • Updated Jul 4, 2024 • 7 downloads • 2 likes

MathGenie/SCDPO-Data-Mistral-Ours
Dataset • Updated Jul 4, 2024 • 30.2k rows • 8 downloads • 3 likes

MathGenie/MATH-GSM8K-Tool-81K
Dataset • Updated Aug 27, 2024 • 81.1k rows • 31 downloads • 2 likes

MathGenie/DPO-Data-Mistral-Ours
Dataset • Updated Jul 5, 2024 • 12.2k rows • 6 downloads • 1 like