KOFFVQA Leaderboard

🏆 Leaderboard | 📄 KOFFVQA Arxiv | 🤗 KOFFVQA Dataset

KOFFVQA🔍 is a free-form visual question answering (VQA) benchmark dataset designed to evaluate Vision-Language Models (VLMs) in Korean language environments. Unlike traditional benchmarks that rely on multiple-choice or predefined answer formats, KOFFVQA challenges models to generate open-ended, natural-language answers to visually grounded questions. This allows for a more comprehensive assessment of a model's ability to understand and generate nuanced Korean responses.

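The dataset encompasses diverse real-world scenarios across the ten categories scored on the leaderboard: object attributes, recognition, recognition-KO, relationship, KO-OCR, commonsense reasoning, document understanding, table understanding, graph and chart understanding, and hallucination and robustness.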

This page is updated continuously, and requests to add models to the leaderboard are welcome. For details, please refer to the "Submit" tab.

Rank	Name	Eval Date	Params (B)	⭐ Overall	object attributes	recognition	recognition-KO	relationship	KO-OCR	commonsense reasoning	document understanding	table understanding	graph and chart understanding	hallucination and robustness
1	gemini-2.5-pro-exp-03-25	2025-03-31		89.67	86	90	90	79.67	95	89.11	100	96.67	93.33	70
2	gemini-2.5-flash-preview-04-17	2025-04-30		89.53	85.83	100	85	70.67	85	92.67	98.67	95.67	96.67	90
3	gemini-2.5-pro-preview-05-06	2025-05-09		88.98	84.67	90	90	77.67	95	89.33	100	96.67	91.33	70
4	o3-2025-04-16	2025-04-17		88.29	84.83	80	90	78	95	96.22	90	94	86.67	80
5	o1-2024-12-17	2025-03-31		88.25	83.33	100	100	86.67	95	92.67	90	86.67	83.33	80
6	o4-mini-2025-04-16	2025-04-17		87.85	84.33	90	70	82	95	84	100	95.33	90	80
7	gemini-2.5-flash-preview-05-20	2025-05-21		86.76	78	90	90	73.33	85	92	100	96.67	88	80
8	gpt-4.1-mini-2025-04-14	2025-04-17		85.82	82.5	90	60	79	90	89.56	93.33	85	90	90
9	gpt-4.1-2025-04-14	2025-04-17		85.71	88.33	90	85	83.33	95	91.56	78	88.67	73.33	80
10	chatgpt-4o-latest (2025-01-29)	2025-02-14		85.35	80.67	90	80	80	95	90.44	86.67	88.67	83.33	80
11	gpt-4.5-preview-2025-02-27	2025-02-28		85.31	85.5	80	80	76.67	95	93.56	90.67	93.33	70	70
12	gpt-4o-2024-11-20	2024-12-05		81.96	78.33	90	85	80	91.5	85.56	86.67	82.33	74.67	70
13	claude-3-5-sonnet-20241022	2024-12-05		80.47	81.83	90	80	66	76.5	88.89	78	73.67	88.67	80
14	Qwen2.5-VL-72B-Instruct	2025-02-05	73.4	80.11	81.83	90	45	63.67	95	85.33	83.33	84	76.67	80
15	Llama-4-Scout-17B-16E-Instruct	2025-04-11	108.6	79.93	84.33	70	40	61.67	88.5	83.56	82.67	85.33	83.33	90
16	gemini-2.0-flash-001	2025-02-12		79.09	73.5	80	70	65.33	93.5	72.67	90	96.67	88	50
17	gemini-2.0-flash-exp	2024-12-12		78.87	73.83	80	70	56.67	90	82.67	93.33	90	84.67	50
18	gpt-4o-2024-08-06	2024-12-05		77.6	77.5	80	90	64.67	80	87.56	77	82	68	70
19	gemini-2.0-pro-exp-02-05	2025-02-12		77.6	80	80	90	58	85	87.11	86	78.67	68	50
20	gemini-1.5-pro-002	2024-12-05		77.24	71.33	90	60	69.33	62.5	83.33	94.67	80	84.67	60
21	Llama-4-Maverick-17B-128E-Instruct	2025-04-22	401.6	76.95	79.17	70	20	74.33	78	76.44	86	80	83.33	80
22	gemini-2.0-flash-lite-001	2025-03-28		76.07	71	90	70	54.67	88	79.11	90	90	66.67	70
23	Qwen2-VL-72B-Instruct	2024-12-05	73.4	74.76	86.67	80	45	62.67	75	83.11	64	84.33	61.33	70
24	Qwen2.5-VL-32B-Instruct	2025-03-25	33.5	74.44	77	65	60	65.67	70	82.44	70.33	80.33	71.33	86
25	gemini-1.5-flash-002	2024-12-05		73.45	72.83	90	50	68	72.5	78.22	89.33	83.33	61.33	40
26	InternVL3-78B	2025-04-15	78.4	70.25	81.5	70	35	58	80	82.67	58.67	65.67	59.67	80
27	Ovis2-8B	2025-02-12	8.9	69.53	77.83	65	30	70.67	70	76.22	53.67	74	64	80
28	InternVL3-38B	2025-04-25	38.4	68.98	75.5	55	30	70.67	70	82.44	71.67	55	56.67	86
29	InternVL3-14B	2025-04-25	15.1	68.47	77.83	65	25	69.33	70	77.78	60.33	53.67	73.33	66
30	gpt-4o-mini-2024-07-18	2024-12-05		68.29	71.33	80	35	66.33	100	77.78	63	47.67	61.33	70
31	Llama-3.2-90B-Vision-Instruct	2024-12-09	88.6	67.93	75	75	40	62.67	80	76	56	55	68	76
32	Qwen2.5-VL-7B-Instruct	2025-02-04	8.3	67.75	78.33	50	25	55.33	85	71.56	74.67	61.67	58.67	75
33	gemma-3-27b-it	2025-03-13	27.4	67.56	66.33	90	40	68	85	83.78	47.33	55.67	73.33	50
34	InternVL2_5-78B	2024-12-09	78.4	67.16	71.33	75	25	66.67	70	78.89	60.67	51	68	85
35	Qwen2.5-Omni-7B	2025-03-31	8.9	66.29	65.33	70	25	64.67	90	72.22	72.67	61	55.33	70
36	VARCO-VISION-14B	2024-12-05	15.2	65.96	76.67	45	10	58	85	70.67	48.33	74	63.33	80
37	gemma-3-12b-it	2025-03-13	12.2	65.93	67.67	85	10	64.67	67.5	84.44	57.33	52.33	71.33	60
38	gpt-4.1-nano-2025-04-14	2025-04-17		65.53	71.5	80	20	61	60	76.89	71	45.67	68	70
39	gemini-2.0-flash-lite-preview-02-05	2025-02-12		65.24	59.83	75	60	43.67	68	72.67	80.33	71.67	63.33	60
40	gpt-4-turbo-2024-04-09	2024-12-05		65.2	76.67	90	60	76.33	30	80	47.33	39.33	64.67	80
41	claude-3-7-sonnet-20250219	2025-02-25		64.87	69.5	50	40	59	25	86.89	64.67	73	72	30
42	Ovis2-34B	2025-02-12	34.9	64.33	81	80	35	77	45	80.89	37.67	40	60	70
43	Qwen2-VL-7B-Instruct	2024-12-05	8.3	63.16	73.17	50	40	56	70	74.89	64.33	50	53.33	60
44	gemini-1.5-flash-8b-001	2024-12-05		61.85	68.67	50	15	46.67	35	82.44	91.33	51.33	53.33	55
45	Llama-3.2-MAAL-11B-Vision-v0.1	2024-12-05	11.1	61.13	78	65	5	51.67	63	77.33	53	45	50	70
46	InternVL2_5-38B	2024-12-16	38.4	60.25	64.5	35	15	57.67	45	69.33	56.33	58.67	71.33	86
47	HyperCLOVAX-SEED-Vision-Instruct-3B	2025-04-24	3.7	59.6	73.83	50	20	68.67	75	67.56	56.33	35.33	37	80
48	claude-3-opus-20240229	2024-12-13		58.91	65.67	80	25	67.67	75	82.67	45	31.67	40	46
49	InternVL3-9B	2025-04-14	9.1	57.02	64.5	35	0	50.33	77.5	62.22	67	50	54.67	45
50	InternVL3-8B	2025-04-14	7.9	55.82	58.5	25	20	62.33	70	66.44	47.33	40.33	65.33	54
51	llava-onevision-qwen2-72b-ov-sft	2024-12-09	73.2	54.15	77.83	65	5	71	5	79.33	23	18.33	54	86
52	aya-vision-32b	2025-03-12	33.1	52.07	68.83	100	35	70	10	76.89	24.33	16.33	38.67	70
53	InternVL2-Llama3-76B	2024-12-09	76.3	51.89	72.67	90	15	60	5	77.11	20.33	14	56.67	76
54	Kimi-VL-A3B-Instruct	2025-04-16	16.4	51.27	64.33	50	5	55.67	65	56	41.33	33.33	48.67	50
55	claude-3-haiku-20240307	2024-12-05		50.47	59.83	70	0	57.33	50	73.56	26.67	25.33	46.67	60
56	Llama-3.2-11B-Vision-Instruct	2024-12-05	10.7	50.36	55.5	80	25	62	62.5	58.67	36	24.67	46.67	50
57	Qwen2.5-VL-3B-Instruct	2025-02-04	3.8	49.56	54.33	40	10	32.33	85	55.11	53.67	48.33	42	40
58	Mistral-Small-3.1-24B-Instruct-2503	2025-04-11	24	49.2	60.33	55	15	62.67	15	62.44	29	38.33	46.67	80
59	gemma-3-4b-it	2025-03-13	4.3	49.2	68.83	75	5	45.67	30	78.44	23	18	48.67	41
60	InternVL2_5-26B	2024-12-16	25.5	44.95	57.5	25	10	34	55	60.67	32	39.67	40	36
61	Pixtral-12B-2409	2024-12-05	12.7	44.62	61.67	65	10	45.67	10	65.11	17.67	23.67	56	40
62	aya-vision-8b	2025-03-12	8.6	44.44	70.67	75	25	48.33	5	72.67	27.67	9.67	14.67	60
63	InternVL2_5-8B	2024-12-09	8.1	44.18	56.33	25	5	40.67	30	57.33	44	21.33	56.67	41
64	Aria	2024-12-11	25.3	44.15	66.5	55	0	50.67	5	79.33	17.33	9	37.33	50
65	Qwen2-VL-2B-Instruct	2024-12-11	2.2	43.75	61	55	20	37.33	35	48.22	54	19.33	33.33	43
66	InternVL3-2B	2025-04-25	2.1	43.67	47.17	5	15	34.33	71.5	52.67	47	27.33	50.67	40
67	llava-onevision-qwen2-7b-ov	2024-12-05	8	40.98	60.33	35	15	54.33	5	63.33	21.67	6.67	30.67	80
68	llama-3.2-Korean-Bllossom-AICA-5B	2024-12-16	5.2	40.18	47.83	35	0	43.33	35	51.56	36.67	23	50.67	20
69	Ovis1.6-Gemma2-9B	2024-12-05	10.2	38.98	73.17	55	0	33	10	45.11	15	17	31.33	66
70	Molmo-72B-0924	2024-12-09	73.3	36.58	45.5	25	5	47.67	0	54.44	18.33	20.67	42.67	70
71	InternVL3-1B	2025-04-25	0.9	33.75	28.83	20	0	26	50	44	46	20	38.67	45
72	MiniCPM-V-2_6	2024-12-05	8.1	32.69	56	35	0	33.67	5	59.78	20.67	5.33	16.67	20
73	InternVL2-8B	2024-12-05	8.1	32.04	49.67	20	0	36	5	36.22	24.67	16.33	31	66
74	Phi-3.5-vision-instruct	2024-12-16	4.1	31.89	41.17	0	5	39.33	10	39.78	21.67	19.33	40	65
75	claude-3-sonnet-20240229	2024-12-13		31.85	28.5	20	0	32	27.5	30.22	38	49.33	42	10
76	Molmo-7B-D-0924	2024-12-05	8	30.25	52.5	20	0	32	5	47.78	12	9.33	27.33	30
77	Phi-4-multimodal-instruct	2025-04-01	5.6	30.18	42.17	10	0	23.67	11.5	39.56	25.67	24	40	26
78	Ovis1.6-Gemma2-27B	2024-12-05	28.9	30.18	43.33	50	0	11	5	43.56	29	11.33	46.67	20
79	MAmmoTH-VL-8B	2024-12-16	8	25.96	36.33	60	0	17.33	5	33.11	24.33	5.33	42	10
80	SmolVLM-Instruct	2024-12-11	2.2	21.31	14	0	5	36.33	5	31.33	21	8.67	36.67	38
81	Idefics3-8B-Llama3	2024-12-11	8.5	18.51	26.67	0	0	23.33	0	20	21	5.33	33.33	10
82	internlm-xcomposer2d5-7b	2024-12-05	11.1	8.33	14	20	0	3.33	5	10.89	6.67	0	5.33	20
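
For readers who want to work with these numbers programmatically, the following is a minimal sketch of parsing the tab-separated rows and finding the best-scoring model under a given size. It assumes that a blank "Params (B)" cell means the parameter count is not listed (as is the case for the API-only models above). The helper names `load_rows` and `best_under` are hypothetical, and only a few rows and columns are copied from the table to keep the example short.

```python
import csv
import io

# A few rows copied from the table above (first five columns only).
# The full table follows the same tab-separated layout.
SAMPLE = (
    "Rank\tName\tEval Date\tParams (B)\t⭐ Overall\n"
    "1\tgemini-2.5-pro-exp-03-25\t2025-03-31\t\t89.67\n"
    "14\tQwen2.5-VL-72B-Instruct\t2025-02-05\t73.4\t80.11\n"
    "27\tOvis2-8B\t2025-02-12\t8.9\t69.53\n"
    "47\tHyperCLOVAX-SEED-Vision-Instruct-3B\t2025-04-24\t3.7\t59.6\n"
)


def load_rows(text):
    """Parse tab-separated leaderboard text into dicts with numeric fields."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    rows = []
    for rec in reader:
        params = rec["Params (B)"].strip()
        rows.append({
            "rank": int(rec["Rank"]),
            "name": rec["Name"],
            # Assumption: a blank cell means the size is not listed.
            "params_b": float(params) if params else None,
            "overall": float(rec["⭐ Overall"]),
        })
    return rows


def best_under(rows, max_params_b):
    """Return the highest-scoring row with a listed size <= max_params_b."""
    sized = [
        r for r in rows
        if r["params_b"] is not None and r["params_b"] <= max_params_b
    ]
    return max(sized, key=lambda r: r["overall"], default=None)


rows = load_rows(SAMPLE)
top_small = best_under(rows, 10.0)
print(top_small["name"], top_small["overall"])
```

Models without a listed size are excluded from the size filter rather than treated as zero, so they never appear as "small" models by accident.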
