We sincerely apologise for the earlier errors regarding the gibberish output. The model has been repaired and is usable in the beta phase.

BBLM (Brain Box Language Model) - Gibberish Fix

Novel Architecture

This is a new language model architecture designed from the ground up for training stability!

Components

  1. Persistent Associative Memory (PAM)

    • Differentiable memory matrix with learned read/write heads
    • Multi-hop reading for complex retrieval
    • No RNN - pure associative storage
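The read side of a memory like this can be sketched in a few lines. This is a hypothetical illustration, not BBLM's actual implementation: it assumes content-based addressing (softmax over slot similarities), with each hop feeding the retrieved vector back in as the next query. The sizes match the card's 100 slots and hidden dim 860.

```python
import numpy as np

def stable_softmax(x):
    # Subtract the max before exponentiating (see Stability Features).
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def pam_read(memory, query, hops=2):
    """Content-based multi-hop read over a (slots, dim) memory matrix.

    Hypothetical sketch: each hop scores every slot against the current
    query, then uses the attention-weighted average of the slots as the
    next query. Cost is linear in the number of slots, not quadratic.
    """
    read = query
    for _ in range(hops):
        scores = memory @ read            # similarity to each slot
        weights = stable_softmax(scores)  # one weight per slot
        read = weights @ memory           # weighted combination of slots
    return read

# Toy example with the card's sizes: 100 slots, 860-dim vectors.
rng = np.random.default_rng(0)
memory = rng.standard_normal((100, 860)) * 0.02  # conservative init
query = rng.standard_normal(860) * 0.02
out = pam_read(memory, query, hops=2)
```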
  2. Brain Box Layers (8 layers)

    • Specialized neural regions:
      • Syntax: Grammar and structure
      • Semantic: Meaning and concepts
      • Logic: Reasoning and inference
      • Context: Long-range dependencies
      • Pattern: Repetition detection
    • Sparse routing (only 2 regions active per token)
    • Feedback loops with adaptive iterations
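Sparse routing over the five regions can be sketched as a top-k selection. The region names come from the card; everything else (the scores, the renormalization scheme) is an assumed illustration:

```python
import numpy as np

REGIONS = ["syntax", "semantic", "logic", "context", "pattern"]

def route_token(logits, k=2):
    """Pick the top-k regions for one token and renormalize their weights.

    Hypothetical sketch of sparse routing: only k of the five regions
    receive the token; the other regions are skipped entirely, which is
    where the efficiency comes from.
    """
    top = np.argsort(logits)[-k:]              # indices of the k best regions
    w = np.exp(logits[top] - logits[top].max())
    w = w / w.sum()                            # weights over active regions only
    return [(REGIONS[i], float(wi)) for i, wi in zip(top, w)]

logits = np.array([0.1, 2.0, -0.5, 1.5, 0.3])  # made-up router scores
active = route_token(logits, k=2)
```

With these made-up scores the token is routed to the `semantic` and `context` regions; the other three do no work for it.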
  3. Think Box Layers (12 layers)

    • Latent reasoning in compressed space
    • Chain-of-thought without token generation
    • Planning and verification regions
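"Latent reasoning without token generation" amounts to iterating on a hidden vector until it settles. A minimal sketch, under assumptions: `step_fn` stands in for one Think Box layer, and a convergence check stands in for the model's learned halting.

```python
import numpy as np

def think(latent, step_fn, max_iters=12, tol=1e-4):
    """Iteratively refine a latent 'thought' vector; no tokens are emitted.

    Hypothetical sketch: stop early once the latent stops changing,
    a simple stand-in for adaptive halting.
    """
    for i in range(max_iters):
        new = step_fn(latent)
        if np.linalg.norm(new - latent) < tol:
            return new, i + 1   # converged: halt early
        latent = new
    return latent, max_iters

# Toy step: a contraction, so the loop provably settles before max_iters.
step = lambda z: 0.1 * z
out, iters = think(np.ones(8), step)
```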
  4. Router Networks

    • Learns which regions to activate
    • Adapts computation to problem difficulty
    • Sparse activation for efficiency
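The router itself can be as small as one linear layer from the hidden state to one score per region. This shape is an assumption; the card does not describe the real router's architecture or training:

```python
import numpy as np

def router_logits(hidden, W, b):
    """Minimal router: a linear map from a hidden state to per-region scores.

    Hypothetical sketch; the scores would then feed a top-k selection so
    that only the highest-scoring regions are activated.
    """
    return hidden @ W + b

rng = np.random.default_rng(1)
W = rng.standard_normal((860, 5)) * 0.02   # hidden dim 860 -> 5 regions
b = np.zeros(5)
h = rng.standard_normal(860)
logits = router_logits(h, W, b)
```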

Parameters

  • Total: 393,747,608 (~393.7M)
  • Hidden dim: 860
  • Memory slots: 100
  • Max sequence: 1024

Key Innovations

  • ✓ No quadratic attention cost
  • ✓ Adaptive compute per token
  • ✓ True persistent memory
  • ✓ Compositional reasoning
  • ✓ Brain-inspired specialization
  • ✓ Interpretable routing decisions

Stability Features 🛡️

  • ✓ Gradient clipping everywhere
  • ✓ Stable softmax (subtract max)
  • ✓ NaN guards in all layers
  • ✓ Conservative initialization
  • ✓ Scaled residuals (0.1x)
  • ✓ Label smoothing
  • ✓ Logit clipping
  • ✓ Float32 computation for norms
  • ✓ OOM-proof memory management
  • ✓ Aggressive garbage collection between epochs
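Several of these guards compose naturally around a single sublayer call. A minimal sketch, with assumptions flagged: the clipping bound of 30 is invented (the card does not give one), and `sublayer` stands in for any Brain Box or Think Box region.

```python
import numpy as np

CLIP = 30.0   # assumed clipping bound; not specified in the card

def guarded_layer(x, sublayer):
    """Apply one sublayer wrapped in the card's stability tricks.

    Sketch under assumptions: compute in float32, clip the output,
    replace any NaN/Inf with safe values, then add it back through a
    0.1x scaled residual connection.
    """
    y = sublayer(x.astype(np.float32))             # float32 computation
    y = np.clip(y, -CLIP, CLIP)                    # logit/activation clipping
    y = np.nan_to_num(y, nan=0.0, posinf=CLIP, neginf=-CLIP)  # NaN guard
    return x + 0.1 * y                             # scaled residual (0.1x)

# A pathological sublayer that emits overflow, NaN, and -Inf:
x = np.ones(4, dtype=np.float32)
bad = lambda v: np.array([1e9, np.nan, -np.inf, 2.0], dtype=np.float32)
out = guarded_layer(x, bad)   # output stays finite regardless
```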

Training Stats

See training_history.csv for metrics including:

  • Cross-entropy loss
  • Z-loss (stability)
  • Gradient norms
  • Accuracy & Perplexity
  • Average halts per layer
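Two of those columns are directly related: perplexity is just the exponential of the cross-entropy loss (when the loss is measured in nats). This is the standard relation, not BBLM-specific code:

```python
import math

def perplexity(cross_entropy_nats):
    """Perplexity = exp(cross-entropy), for a loss measured in nats."""
    return math.exp(cross_entropy_nats)

ppl = perplexity(3.0)   # a loss of 3.0 nats -> perplexity of about 20.1
```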