KOFFVQA Leaderboard

πŸ† Leaderboard | πŸ“„ KOFFVQA Arxiv | πŸ€— KOFFVQA Dataset

KOFFVQA🔍 is a free-form visual question answering (VQA) benchmark dataset designed to evaluate Vision-Language Models (VLMs) in Korean-language environments. Unlike traditional multiple-choice or predefined-answer formats, KOFFVQA requires models to generate open-ended, natural-language answers to visually grounded questions. This allows a more comprehensive assessment of a model's ability to understand images and produce nuanced responses in Korean.

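The dataset covers diverse real-world scenarios across ten categories: object attributes, recognition, recognition-KO, relationship, KO-OCR, commonsense reasoning, document understanding, table understanding, graph and chart understanding, and hallucination and robustness.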

This page is updated continuously, and requests to add models to the leaderboard are accepted. For details, see the "Submit" tab.

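| Rank | Name | Eval Date | Params (B) | ⭐ Overall | object attributes | recognition | recognition-KO | relationship | KO-OCR | commonsense reasoning | document understanding | table understanding | graph and chart understanding | hallucination and robustness |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | | 2025-03-31 | 108.6 | 89.67 | 85.83 | 100 | 100 | 79.67 | 91.5 | 89.11 | 98.67 | 96.67 | 93.33 | 70 |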
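As a quick way to read the row above, the following Python sketch compares the per-category scores with the reported Overall score. It rests on an assumption: the fourteen listed values are taken to map, in order, onto every column except Name, which appears to be missing from the page. The helper names `unweighted_mean` and `weakest_category` are illustrative only and are not part of any KOFFVQA tooling.

```python
# Scores copied from the leaderboard row. The value-to-column mapping is an
# assumption: the Name cell is treated as missing, all other columns in order.
category_scores = {
    "object attributes": 85.83,
    "recognition": 100.0,
    "recognition-KO": 100.0,
    "relationship": 79.67,
    "KO-OCR": 91.5,
    "commonsense reasoning": 89.11,
    "document understanding": 98.67,
    "table understanding": 96.67,
    "graph and chart understanding": 93.33,
    "hallucination and robustness": 70.0,
}
reported_overall = 89.67  # the "Overall" column under the same assumption


def unweighted_mean(scores):
    """Plain average of the category scores (hypothetical helper)."""
    return sum(scores.values()) / len(scores)


def weakest_category(scores):
    """Name of the category with the lowest score (hypothetical helper)."""
    return min(scores, key=scores.get)


mean_score = unweighted_mean(category_scores)
print(f"unweighted mean of categories: {mean_score:.2f}")
print(f"reported overall:              {reported_overall:.2f}")
print(f"weakest category:              {weakest_category(category_scores)}")
```

Under this mapping the unweighted mean comes out near 90.48, slightly above the reported Overall of 89.67. That gap suggests Overall may not be a plain average of the category scores; the page itself does not say how it is computed.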