ahmad21omar committed
Commit 5a9eb8b · 1 Parent(s): e1e81dc

update grid

Files changed (1)
  1. benchmark_data.csv +1 -1
benchmark_data.csv CHANGED
@@ -1,5 +1,5 @@
  Model,Logical Power Ranking,Logical Power Score,Accuracy,Syntax Score,Logic Basic Accuracy,Logic Easy Accuracy,Logic Medium Accuracy,Logic Hard Accuracy
- o3,1,15.4,0.77,0.8,0.99,0.93,0.74,0.43
+ o3,1,15.5,0.78,0.8,0.99,0.93,0.74,0.45
  o4-mini-high,2,12.8,0.64,0.88,0.98,0.96,0.4,0.21
  o4-mini,3,12.3,0.61,0.86,0.93,0.88,0.52,0.13
  o1,4,11.9,0.59,0.68,0.92,0.89,0.41,0.15
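
The one-line change is easier to review column by column. The snippet below is a minimal sketch, not part of the commit: it pairs the removed and added o3 rows with the header and lists the fields that differ. The helper name changed_fields is hypothetical, and the rows are copied inline from the diff rather than read from benchmark_data.csv.

```python
import csv
import io

# Header and the two versions of the o3 row, copied verbatim from the diff.
HEADER = (
    "Model,Logical Power Ranking,Logical Power Score,Accuracy,Syntax Score,"
    "Logic Basic Accuracy,Logic Easy Accuracy,Logic Medium Accuracy,Logic Hard Accuracy"
)
OLD_ROW = "o3,1,15.4,0.77,0.8,0.99,0.93,0.74,0.43"
NEW_ROW = "o3,1,15.5,0.78,0.8,0.99,0.93,0.74,0.45"


def parse_line(line):
    """Parse a single CSV line into a list of string fields."""
    return next(csv.reader(io.StringIO(line)))


def changed_fields(header, old_row, new_row):
    """Hypothetical helper: map each changed column to its (old, new) values."""
    names = parse_line(header)
    old = parse_line(old_row)
    new = parse_line(new_row)
    return {n: (o, v) for n, o, v in zip(names, old, new) if o != v}


changes = changed_fields(HEADER, OLD_ROW, NEW_ROW)
for column, (old, new) in changes.items():
    print(f"{column}: {old} -> {new}")
```

For this commit the sketch reports three changed columns for o3: Logical Power Score (15.4 -> 15.5), Accuracy (0.77 -> 0.78), and Logic Hard Accuracy (0.43 -> 0.45). The values are compared as strings, so a formatting-only change such as 0.8 versus 0.80 would also be reported.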