Commit 5a9eb8b
Parent(s): e1e81dc

update grid

benchmark_data.csv CHANGED (+1 -1)
@@ -1,5 +1,5 @@
 Model,Logical Power Ranking,Logical Power Score,Accuracy,Syntax Score,Logic Basic Accuracy,Logic Easy Accuracy,Logic Medium Accuracy,Logic Hard Accuracy
-o3,1,15.
+o3,1,15.5,0.78,0.8,0.99,0.93,0.74,0.45
 o4-mini-high,2,12.8,0.64,0.88,0.98,0.96,0.4,0.21
 o4-mini,3,12.3,0.61,0.86,0.93,0.88,0.52,0.13
 o1,4,11.9,0.59,0.68,0.92,0.89,0.41,0.15
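
The post-commit contents of benchmark_data.csv can be sanity-checked with a short script. The sketch below is not part of the commit: `load_rows` and `check_rows` are hypothetical helper names, and it assumes that "Logical Power Ranking" is meant to follow descending "Logical Power Score", which the diff does not state. It flags rows with the wrong number of fields and, if all rows are complete, rankings that disagree with the score order.

```python
import csv
import io

# Post-commit contents of benchmark_data.csv, copied from the diff above.
CSV_TEXT = """\
Model,Logical Power Ranking,Logical Power Score,Accuracy,Syntax Score,Logic Basic Accuracy,Logic Easy Accuracy,Logic Medium Accuracy,Logic Hard Accuracy
o3,1,15.5,0.78,0.8,0.99,0.93,0.74,0.45
o4-mini-high,2,12.8,0.64,0.88,0.98,0.96,0.4,0.21
o4-mini,3,12.3,0.61,0.86,0.93,0.88,0.52,0.13
o1,4,11.9,0.59,0.68,0.92,0.89,0.41,0.15
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def check_rows(rows):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # csv.DictReader fills missing fields with None and stores surplus
    # fields under the key None, so either case marks a malformed row.
    for row in rows:
        if None in row or None in row.values():
            problems.append(f"{row['Model']}: wrong number of fields")
    if problems:
        return problems

    # Assumption (not stated in the commit): rank 1 is the highest score.
    by_score = sorted(
        rows, key=lambda r: float(r["Logical Power Score"]), reverse=True
    )
    for expected_rank, row in enumerate(by_score, start=1):
        if int(row["Logical Power Ranking"]) != expected_rank:
            problems.append(
                f"{row['Model']}: ranking {row['Logical Power Ranking']} "
                f"does not match score order (expected {expected_rank})"
            )
    return problems


if __name__ == "__main__":
    for problem in check_rows(load_rows(CSV_TEXT)):
        print(problem)
```

Run against the pre-commit file, where the o3 row has only three fields, the same check would report that row as malformed.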