Article Luth: Efficient French Specialization for Small Language Models MaxLSB • Aug 11, 2025 • 20
VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Paper • 2604.02486 • Published Apr 2 • 10
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts? Paper • 2503.18018 • Published Mar 23, 2025 • 7
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 189
Reasoning Models Struggle to Control their Chains of Thought Paper • 2603.05706 • Published Mar 5 • 37
Article TextQuests: How Good are LLMs at Text-Based Video Games? justinphan3110, clefourrier • Aug 12, 2025 • 38
Article Supercharge your OCR Pipelines with Open Models merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 312
FrenchBench Evaluation datasets Collection These datasets are used to evaluate model performance on French with https://github.com/EleutherAI/lm-evaluation-harness (from the CroissantLLM paper) • 11 items • Updated Jun 7, 2024 • 8