davidlms committed
Commit c143b9c · verified · 1 parent: 3aabb07

Add model-index with benchmark evaluations


Added structured evaluation results from benchmark image:
- SimpleQA: 8.90
- MUSR: 63.49
- MMLU (Zero Shot): 84.95
- Math-500: 92.10
- GPQA-Diamond: 58.55
- BFCL V3: 59.67

This enables the model to appear on leaderboards and makes it easier to compare with other models.

Files changed (1): README.md (+30, -0)
README.md CHANGED
@@ -15,6 +15,36 @@ language:
 library_name: transformers
 base_model:
 - arcee-ai/Trinity-Mini-Base
+model-index:
+- name: Trinity-Mini
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      name: Benchmarks
+      type: benchmark
+    metrics:
+    - name: SimpleQA
+      type: simpleqa
+      value: 8.9
+    - name: MUSR
+      type: musr
+      value: 63.49
+    - name: MMLU (Zero Shot)
+      type: mmlu_zero_shot
+      value: 84.95
+    - name: Math-500
+      type: math_500
+      value: 92.1
+    - name: GPQA-Diamond
+      type: gpqa_diamond
+      value: 58.55
+    - name: BFCL V3
+      type: bfcl_v3
+      value: 59.67
+    source:
+      name: Model README
+      url: https://huggingface.co/arcee-ai/Trinity-Mini
 ---
 <div align="center">
 <picture>
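As a sanity check (a minimal sketch, not part of the commit), this is the Python structure a leaderboard scraper would see after parsing the `model-index` front matter above, and how the per-benchmark scores can be flattened out of it:

```python
# The model-index block from the diff, as it looks once the README's
# YAML front matter has been parsed (structure mirrors the YAML above).
model_index = [
    {
        "name": "Trinity-Mini",
        "results": [
            {
                "task": {"type": "text-generation"},
                "dataset": {"name": "Benchmarks", "type": "benchmark"},
                "metrics": [
                    {"name": "SimpleQA", "type": "simpleqa", "value": 8.9},
                    {"name": "MUSR", "type": "musr", "value": 63.49},
                    {"name": "MMLU (Zero Shot)", "type": "mmlu_zero_shot", "value": 84.95},
                    {"name": "Math-500", "type": "math_500", "value": 92.1},
                    {"name": "GPQA-Diamond", "type": "gpqa_diamond", "value": 58.55},
                    {"name": "BFCL V3", "type": "bfcl_v3", "value": 59.67},
                ],
                "source": {
                    "name": "Model README",
                    "url": "https://huggingface.co/arcee-ai/Trinity-Mini",
                },
            }
        ],
    }
]

# Flatten to {metric_type: value} for easy comparison across models.
scores = {
    m["type"]: m["value"]
    for entry in model_index
    for result in entry["results"]
    for m in result["metrics"]
}
print(scores["math_500"])  # 92.1
```

The metric `type` keys (e.g. `math_500`, `bfcl_v3`) are what comparison tools match on, so keeping them stable matters more than the display `name`.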