Nebula
1. Introduction
Nebula is a 320M-parameter generalist Small Reasoning Model trained on 200B+ tokens, designed for edge AI and on-device deployment.
For its size class, Nebula aims to deliver an unusually strong balance of memory, general reasoning, math, and retrieval-friendly behavior, with the goal of outperforming other small models of a similar parameter count on non-code, industry-style benchmarks.
2. Reasoning style
Nebula’s reasoning traces use an intentionally compact style: dense, short, often non-verbal sentences, optimized for efficiency under limited model capacity.
Traces use the following shorthand notation, implemented as special tokens:
Logical markers
| Token | Meaning | Usage |
|---|---|---|
| → | derivation / implication | For very short causal/logical flow |
| ↺ | iterative return / refinement loop | For backtracking, reconsidering priors, RAG re-querying |
| ? | uncertainty/questions to resolve | Can be appended to short expressions/words, not only interrogatives |
| !/※ | insight/breakthroughs | Emphatic mark for knowledge discovery |
| ≈ | approximation/estimates | For intermediary hypothesis / uncertain preliminary statements |
| ∴ | therefore / final step | Use sparingly to mark stable conclusions |
Uncertainty
| Token | Meaning | Usage |
|---|---|---|
| ● | high confidence | well-supported empirical/theoretical ground; “anchor points.” |
| ◐ | medium/partial confidence | incomplete data; plausible but unverified links |
| ○ | low confidence | speculation, missing context, weak inference chain |
| ⚠ | bias/premise risk | domain mismatch, cultural assumptions, language-switch artifacts |
| ?maybe? | soft speculation | marks tentative ideas, branches that might collapse later |
Verification process
| Token | Meaning | Usage |
|---|---|---|
| ☐ | unverified hypothesis | raw claim, no cross-check yet |
| ☑ | intermediate verification | one source/argument supports it |
| ✓ | confirmed/validated | multiple independent supports (●-level) |
This reasoning format is designed to remain expressive while being lightweight enough for a small model.
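As an illustration of how these markers compose in practice, here is a minimal Python sketch that scans a trace for the notation above. The trace string is a hand-written example of the style, not actual Nebula output, and the helper function is purely illustrative.

```python
# Sketch: scanning a Nebula-style trace for notation markers.
# The trace below is a hand-written illustration, not real model output.

CONFIDENCE = {"●": "high", "◐": "medium", "○": "low"}
VERIFICATION = {"☐": "unverified", "☑": "intermediate", "✓": "confirmed"}

trace = (
    "12*13? → split 12*13 = 12*10 + 12*3 ≈ quick path ☐ "
    "→ 120 + 36 = 156 ☑ "
    "↺ check: 13*12 = 130 + 26 = 156 ✓ "
    "∴ 156 ●"
)

def marker_counts(text: str, table: dict) -> dict:
    """Count how often each marker from `table` appears in `text`."""
    return {tok: text.count(tok) for tok in table if tok in text}

print(marker_counts(trace, CONFIDENCE))    # which confidence anchors appear
print(marker_counts(trace, VERIFICATION))  # how far verification progressed
```

Because the markers are single tokens, this kind of scan can also be used downstream, e.g. to gate answers on whether a trace reached a ✓-level conclusion.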
3. Fine-Tuning/RL
Nebula has been successfully fine-tuned for a variety of tasks.
Because Nebula is a reasoning-oriented model, it is expected to train well with reinforcement learning methods such as GRPO, both for verifiable tasks (with objective rewards) and for subjective tasks using an LLM-as-a-judge.
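For the verifiable-task case, a reward function can simply check the trace's final conclusion against a reference answer. The sketch below is an assumption about how such a reward could be written (the function name, the `∴ <answer>` convention at the end of a trace, and the exact-match criterion are illustrative, not part of Nebula's documented training setup):

```python
# Sketch of a verifiable reward function of the kind usable with GRPO.
# Assumes traces end with the final-step marker: "∴ <answer>".
import re

def exact_answer_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the '∴'-marked conclusion matches the reference
    answer exactly, else 0.0 (including when no marker is present)."""
    match = re.search(r"∴\s*(.+?)\s*$", completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0

# Example: a trace ending in the notation's final-step marker.
trace = "12*13 → 120 + 36 = 156 ✓ ∴ 156"
print(exact_answer_reward(trace, "156"))  # 1.0
```

For subjective tasks, the same callable shape works with an LLM-as-a-judge producing the score instead of an exact-match check.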
4. Benchmarks
| Model | MMLU |
|---|---|
| Nebula | 40.0 |
| SmolLM2-360M | 35.8 |
| Gemma 3 270M (IT) | 26.5 |
| Granite-4.0-H-350M | 36.2 |