Shisa V2.1 Collection A family of bilingual JA/EN LLMs. https://shisa.ai/posts/shisa-v2.1/ • 5 items • Updated about 9 hours ago • 1
Article I Built a RAG System That Listens to Live BBC News and Answers Questions About "What Happened 10 Minutes Ago" • about 24 hours ago • 6
Bigger-Personality Collection Dans-PersonalityEngine-V1.3.0-12b with Bigger-Body-12b to try to maintain Bigger-Body datasets while reducing initial message sensitivity. • 5 items • Updated Sep 28 • 1
Rose-Engine-12B Collection Dans-PersonalityEngine-V1.3.0-12b with heavy furry and NSFW biases. • 5 items • Updated 26 days ago • 1
Article Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement • 7 days ago • 11
RAIF Collection Datasets and models in the paper "Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models" [github.com/yuleiqin/RAIF]. • 12 items • Updated Jul 17 • 2
SPEAR Collection Checkpoints for "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" (arXiv:2509.22601) • 14 items • Updated 6 days ago • 1
SEFL: Synthetic Educational Feedback Loops Collection Models and data corresponding to the SEFL paper • 9 items • Updated Aug 1 • 2
fs1 Collection Models and data from "Scaling Reasoning can Improve Factuality in Large Language Models". • 20 items • Updated Sep 9 • 3
Article September(2025) LLM Commonsense & Social Benchmarks Report [Foresight Analysis] By (AIPRL-LIR) AI Parivartan Research Lab(AIPRL)-LLMs Intelligence Report • 7 days ago • 2
Article DeepFabric: Generate, Train and Evaluate with Datasets curated for Model Behavior Training. • 6 days ago • 5
MonikaV1 Collection My first group of Monika finetunes for MonikAI • 10 items • Updated 8 days ago • 1
Ministral 3 Collection Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated 3 days ago • 21
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 8 days ago • 73
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 8 days ago • 119
Article Transformers v5: Simple model definitions powering the AI ecosystem • 10 days ago • 233