Nebula


1. Introduction

Nebula is a 320M-parameter generalist Small Reasoning Model trained on 200B+ tokens, designed for edge AI and on-device deployment.

Nebula aims to deliver an unusually strong balance of memory, general reasoning, math, and retrieval-friendly behavior for its size class, with the goal of outperforming many small models in a similar parameter range on non-code, industry-style benchmarks.

2. Reasoning style

Nebula’s reasoning traces use an intentionally compact style: dense, short, often non-verbal sentences, optimized for efficiency under limited model capacity.

Traces use the following shorthand notation, integrated into special tokens:

Logical markers

| Token | Meaning | Usage |
|---|---|---|
| | derivation / implication | For very short causal/logical flow |
| | iterative return / refinement loop | For backtracking, reconsidering priors, RAG re-querying |
| `?` | uncertainty / questions to resolve | Can be appended to short expressions/words, not only interrogatives |
| `!`/`※` | insight / breakthroughs | Emphatic mark for knowledge discovery |
| | approximation / estimates | For intermediary hypotheses / uncertain preliminary statements |
| | therefore / final step | Use sparingly to mark stable conclusions |

Uncertainty

| Token | Meaning | Usage |
|---|---|---|
| `●` | high confidence | Well-supported empirical/theoretical ground; “anchor points” |
| | medium/partial confidence | Incomplete data; plausible but unverified links |
| | low confidence | Speculation, missing context, weak inference chain |
| | bias/premise risk | Domain mismatch, cultural assumptions, language-switch artifacts |
| `?maybe?` | soft speculation | Marks tentative ideas, branches that might collapse later |

Verification process

| Token | Meaning | Usage |
|---|---|---|
| | unverified hypothesis | Raw claim, no cross-check yet |
| | intermediate verification | One source/argument supports it |
| | confirmed/validated | Multiple independent supports (`●`-level) |

This reasoning format is designed to remain expressive while being lightweight enough for a small model.

3. Fine-Tuning/RL

Nebula has been successfully fine-tuned for a variety of tasks.

Because Nebula is a reasoning-oriented model, it is expected to train well with reinforcement learning methods such as GRPO, both for verifiable tasks (with objective rewards) and for subjective tasks using an LLM-as-a-judge.
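As a sketch of the verifiable-task case, a GRPO reward can be as simple as an exact-match checker over model completions. The function below follows the reward-function convention used by libraries such as TRL’s `GRPOTrainer` (a batch of completions in, one score per completion out); the number-extraction regex and the 0/1 scoring are illustrative assumptions, not part of Nebula’s actual training recipe.

```python
import re

def exact_match_reward(completions, answers, **kwargs):
    """Return 1.0 for each completion whose final answer matches the
    reference, else 0.0. Illustrative verifiable reward, not Nebula's
    actual recipe."""
    rewards = []
    for completion, answer in zip(completions, answers):
        # Hypothetical extraction rule for a math-style task:
        # treat the last number-like token as the model's final answer.
        matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
        predicted = matches[-1] if matches else None
        rewards.append(1.0 if predicted == str(answer) else 0.0)
    return rewards
```

With TRL, such a function could be passed via `reward_funcs=[exact_match_reward]`; for subjective tasks it would be replaced by an LLM-as-a-judge scorer returning graded rewards.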

4. Benchmarks

| Model | MMLU |
|---|---|
| Nebula | 40.0 |
| SmolLM2-360M | 35.8 |
| Gemma 3 270M (IT) | 26.5 |
| Granite-4.0-H-350M | 36.21 |