N8Programs committed on
Commit c0cf022 · verified · 1 Parent(s): 8e63027

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -6,6 +6,9 @@ pipeline_tag: text-generation
 tags:
 - pretrained
 ---
+
+[![Radar Chart](radar_chart.png)](radar_chart.png)
+
 ## Model Summary
 
 NextTerm-47M is a pretrained transformer with 47.2M parameters, trained on 1.9 billion tokens of augmented data from the On-Line Encyclopedia of Integer Sequences (OEIS). It is designed to predict the next term in integer sequences. It displays exceptional in-context learning capabilities and outperforms far larger generic LLMs on OEIS sequence-completion tasks. It supports both MLX and HuggingFace transformers.
@@ -32,8 +35,6 @@ Not all predictions are successful; example failure case below:
 
 ## Evaluation Results
 
-[![Radar Chart](radar_chart.png)](radar_chart.png)
-
 ### Arithmetic Evaluation
 
 The arithmetic evaluation consists of predicting the next term in sequences generated by polynomial functions of varying degree (arithmetic, quadratic, cubic, quartic) across varying shot counts; models are scored on exact-match accuracy of their predictions. NextTerm-47M outperforms all Qwen models under 4B parameters, though the larger Qwen models do better on lower-degree polynomials.
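Since the README states that the model supports HuggingFace transformers, a minimal next-term prediction sketch might look like the following. The repository id `N8Programs/NextTerm-47M` and the plain space-separated prompt format are assumptions for illustration, not confirmed by this commit.

```python
# Minimal sketch of next-term prediction with HuggingFace transformers.
# Assumptions (not confirmed by this commit): the repo id below and the
# plain space-separated text format for integer sequences.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "N8Programs/NextTerm-47M"  # hypothetical repository id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Prompt with the known terms of a sequence and let the model continue it.
prompt = "1 1 2 3 5 8 13 21"  # Fibonacci; the next term should be 34
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```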
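The arithmetic evaluation described in the README can be outlined in code: draw a random integer polynomial of a given degree, show the model the first `shots` terms, and count a prediction only on exact match. The sketch below is illustrative of that protocol, not the authors' evaluation harness; `predict_next_term` is a hypothetical stand-in for the model under test.

```python
# Illustrative sketch of the arithmetic evaluation protocol: sequences come
# from random integer polynomials of a chosen degree, the model sees the
# first `shots` terms, and a prediction counts only on exact match.
import random

def polynomial_sequence(degree, length, coeff_range=(-5, 5)):
    """Generate `length` terms of a random integer polynomial of `degree`."""
    coeffs = [random.randint(*coeff_range) for _ in range(degree + 1)]
    return [sum(c * n**i for i, c in enumerate(coeffs)) for n in range(length)]

def exact_match_accuracy(predict_next_term, degree, shots, trials=100):
    """Fraction of trials where the predicted next term matches exactly."""
    hits = 0
    for _ in range(trials):
        seq = polynomial_sequence(degree, shots + 1)
        context, target = seq[:shots], seq[shots]
        if predict_next_term(context) == target:
            hits += 1
    return hits / trials
```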