N8Programs/NextTerm-47M

Text Generation

N8Programs committed on Nov 27

Commit c0cf022 · verified · 1 Parent(s): 8e63027

Update README.md

Files changed (1)

README.md +3 -2

@@ -6,6 +6,9 @@ pipeline_tag: text-generation
 tags:
 - pretrained
 ---
+[![Radar Chart](radar_chart.png)](radar_chart.png)
 ## Model Summary
 NextTerm-47M is a pretrained transformer w/ 47.2M parameters, trained on 1.9 billion tokens of augmented data from the On-Line Encyclopedia of Integer Sequences (OEIS). It is designed to predict the next term in integer sequences. It displays exceptional in-context learning capabilities, and outperforms far larger generic LLMs on OEIS sequence completion tasks. It supports MLX and HuggingFace transformers.
@@ -32,8 +35,6 @@ Not all predictions are successful; example failure case below:
 ## Evaluation Results
-[![Radar Chart](radar_chart.png)](radar_chart.png)
 ### Arithmetic Evaluation
 The arithmetic evaluation consists of predicting the next term in sequences generated by polynomial functions of varying degrees (arithmetic, quadratic, cubic, quartic), across varying shot counts. The models are evaluated based on the accuracy of their predictions, w/ exact-match. NextTerm-47M outperforms all Qwen models <4B, though larger Qwen models do better on lower-degree polynomials.
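
The arithmetic evaluation described in the README text above can be illustrated with a small sketch. This is not the author's harness: the coefficient ranges, the choice to sample x = 0, 1, 2, ..., the task count, and every function name here are assumptions made for illustration. `difference_baseline` is a hypothetical stand-in predictor (finite-difference extrapolation), used only so the scoring loop has something to call; a real run would pass a function that queries the model instead.

```python
import random


def poly_sequence(coeffs, n_terms):
    """Return f(0), f(1), ..., f(n_terms - 1) for integer coefficients (lowest degree first)."""
    return [sum(c * x**k for k, c in enumerate(coeffs)) for x in range(n_terms)]


def make_task(degree, shots, rng):
    """Build one task: `shots` context terms plus the held-out next term.

    Assumption: coefficients are small integers and the leading one is nonzero,
    so the sequence really has the requested degree.
    """
    coeffs = [rng.randint(-9, 9) for _ in range(degree)] + [rng.randint(1, 9)]
    terms = poly_sequence(coeffs, shots + 1)
    return terms[:-1], terms[-1]


def difference_baseline(context):
    """Hypothetical stand-in predictor: extrapolate one step via finite differences."""
    rows = [list(context)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    # The next term is the sum of the last entry of every difference row.
    return sum(row[-1] for row in rows)


def exact_match_accuracy(predict, degree, shots, n_tasks=200, seed=0):
    """Fraction of tasks where the prediction equals the true next term exactly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tasks):
        context, target = make_task(degree, shots, rng)
        hits += int(predict(context) == target)
    return hits / n_tasks


if __name__ == "__main__":
    # Degrees 1..4 correspond to arithmetic, quadratic, cubic, quartic.
    for degree in (1, 2, 3, 4):
        for shots in (3, 5, 8):
            acc = exact_match_accuracy(difference_baseline, degree, shots)
            print(f"degree={degree} shots={shots} exact-match={acc:.2f}")
```

Under these assumptions the baseline is exact whenever the shot count exceeds the polynomial degree and fails otherwise, which shows why accuracy in this kind of evaluation depends on both degree and shot count.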