view article Article Introducing Storage Buckets on the Hugging Face Hub +10 Wauplin, coyotte508, XciD, victor, julien-c, lhoestq, pierric, Sylvestre, hlarcher, rajatarya, seanses, assafvayner • Mar 10 • 194
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 188
view article Article Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training +3 smohammadi, siro1, winglian, marcsun13, djsaunde • Aug 8, 2025 • 98
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Paper • 2510.25992 • Published Oct 29, 2025 • 48
view article Article Gaia2 and ARE: Empowering the community to study agents +9 clefourrier, gregmialz, mlcu, mortimerp9, XciD, tfrere, evijit, RomainFroger, dheeraj7596, CarolinePascal, upiter • Sep 22, 2025 • 134
view article Article Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation exploding-gradients • Sep 16, 2025 • 20
Towards General Agentic Intelligence via Environment Scaling Paper • 2509.13311 • Published Sep 16, 2025 • 72
SuperBPE Collection SuperBPE tokenizers and models trained with them • 8 items • Updated Mar 23 • 17
💧 LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 28 items • Updated Apr 8 • 154
Hybrid Linear Attention Research Collection All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated Sep 11, 2025 • 14
Avey 1 Research Preview Collection 1.5B preview models trained on 100B tokens of FineWeb, and an instruct-tuned version on smoltalk. • 3 items • Updated Jun 16, 2025 • 7
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13, 2025 • 218
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 33 items • Updated Mar 2 • 59
Kimina Prover Preview Collection State-of-the-Art Models for Formal Mathematical Reasoning • 5 items • Updated Apr 28, 2025 • 33