badmadrad/Devstral-Small-2-24B-Instruct-2512-MLX-3bit • Text Generation • 24B • Updated 6 days ago • 77
abhiv26/Qwen2.5-7B-Instruct-ToolRL-PPO-Cold-Equal-Max • Reinforcement Learning • 8B • Updated 4 days ago • 6
Kumeichi/qwen3-4b-agent-trajectory-lora-SFT-SQL-ALFWorld_rev.0 • Text Generation • 4B • Updated 3 days ago