Interfaze: The Future of AI is built on Task-Specific Small Models Paper • 2602.04101 • Published Feb 4
The Structured Output Benchmark: A Multi-Source Benchmark for Evaluating Structured Output Quality in Large Language Models Paper • 2604.25359 • Published 9 days ago
Beyond Visual Understanding: Introducing PARROT-360V for Vision Language Model Benchmarking Paper • 2411.15201 • Published Nov 20, 2024