Update README.md
README.md
CHANGED
@@ -127,7 +127,7 @@ Moo Moo the cow would certinaly win.
 
 ## Evaluation
 
-| Skill | Benchmark | Olmo 3 7B
+| Skill | Benchmark | Olmo 3 Think 7B SFT | Olmo 3 Think 7B DPO | Olmo 3 Think 7B | OpenThinker3-7B | Nemotron-Nano-9B-v2 | DeepSeek-R1-Distill-Qwen-7B | Qwen 3 8B (reasoning) | Qwen 3 VL 8B Thinker | OpenReasoning Nemotron 7B |
 |-------|-----------|------------------|------------------|--------------|------------------|-----------------------|------------------------------|-------------------------|---------------------------|-----------------------------|
 | **Math** | MATH | 94.4 | 92.4 | 95.1 | 94.5 | 94.4 | 87.9 | 95.1 | 95.2 | 94.6 |
 | | AIME 2024 | 69.6 | 74.6 | 71.6 | 67.7 | 72.1 | 54.9 | 74.0 | 70.9 | 77.0 |