MIXED GLOBAL + REGIONAL

Korean Benchmarks Leaderboard

How do global frontier models stack up against regional Korean models on domestic tasks? This leaderboard ranks all models exclusively on Korean benchmarks such as KMMLU, KMMLU-Hard, CLIcK, and KoBALT.

Claude Sonnet 4.6 currently leads the cross-market Korean view with an average score of 85.0.

This is the right page for deciding whether Korean-market specialists actually outperform global frontier models on Korean-native evaluations, rather than merely leading a regional-only pool.

Rank  Model              Provider        Type      Avg Score
#1    Claude Sonnet 4.6  Anthropic       GLOBAL    85.0
#2    Solar 🇰🇷           Upstage         REGIONAL  80.1
#3    o1                 OpenAI          GLOBAL    79.5
#4    HyperClova X 🇰🇷    Naver Cloud     REGIONAL  78.4
#5    GPT-5.4            OpenAI          GLOBAL    78.2
#6    A.X 🇰🇷             SK Telecom      REGIONAL  78.0
#7    K-Exaone 🇰🇷        LG AI Research  REGIONAL  76.0
#8    Exaone 4.0 🇰🇷      LG AI Research  REGIONAL  75.2
#9    GPT-5              OpenAI          GLOBAL    68.5
#10   GPT-5.2            OpenAI          GLOBAL    61.3
#11   GPT-5              OpenAI          GLOBAL    60.5
#12   GPT-5.1            OpenAI          GLOBAL    54.9
#13   GPT-4.1            OpenAI          GLOBAL    54.1
#14   GPT-4o             OpenAI          GLOBAL    51.9
#15   GPT-4.1            OpenAI          GLOBAL    47.4
#16   GPT-4 Turbo        OpenAI          GLOBAL    44.7
#17   GPT-4o             OpenAI          GLOBAL    38.6
#18   GPT-4.1            OpenAI          GLOBAL    36.5

What these rows mean

KMMLU: Measures massive multitask language understanding across 45 expert-level Korean subjects.

KMMLU-Hard: A harder subset of KMMLU, built from the questions models miss most often, targeting complex Korean reasoning.
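The page does not document how the Avg Score column is aggregated from these benchmarks. As an illustrative sketch only, assuming it is an unweighted mean over whichever Korean benchmarks a model was evaluated on, the computation would look like this (the per-benchmark numbers below are hypothetical):

```python
def avg_score(scores: dict[str, float]) -> float:
    """Unweighted mean over whichever benchmark scores are present,
    rounded to one decimal place to match the leaderboard's format."""
    return round(sum(scores.values()) / len(scores), 1)

# Hypothetical per-benchmark scores, for illustration only:
print(avg_score({"KMMLU": 84.2, "KMMLU-Hard": 70.5, "CLIcK": 88.0, "KoBALT": 81.3}))
```

If the real aggregation weights benchmarks differently, or averages only over a fixed benchmark set, the ranking of closely scored models could shift, so treat near-ties in the table with caution.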

How to interpret the crossover

While global frontier models like GPT-5 and Claude lead in general reasoning, regional models like HyperClova X and Exaone are trained explicitly on high-quality Korean corpora. This leaderboard tracks the crossover points where regional specialization overtakes sheer model scale.
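One simple way to quantify the crossover is to average the Avg Score column within each pool. The sketch below transcribes the rows from the table above; the pool-level comparison itself is illustrative, not part of the leaderboard's own methodology:

```python
from statistics import mean

# Rows transcribed from the leaderboard above: (model, type, avg_score).
rows = [
    ("Claude Sonnet 4.6", "GLOBAL", 85.0),
    ("Solar", "REGIONAL", 80.1),
    ("o1", "GLOBAL", 79.5),
    ("HyperClova X", "REGIONAL", 78.4),
    ("GPT-5.4", "GLOBAL", 78.2),
    ("A.X", "REGIONAL", 78.0),
    ("K-Exaone", "REGIONAL", 76.0),
    ("Exaone 4.0", "REGIONAL", 75.2),
    ("GPT-5", "GLOBAL", 68.5),
    ("GPT-5.2", "GLOBAL", 61.3),
    ("GPT-5", "GLOBAL", 60.5),
    ("GPT-5.1", "GLOBAL", 54.9),
    ("GPT-4.1", "GLOBAL", 54.1),
    ("GPT-4o", "GLOBAL", 51.9),
    ("GPT-4.1", "GLOBAL", 47.4),
    ("GPT-4 Turbo", "GLOBAL", 44.7),
    ("GPT-4o", "GLOBAL", 38.6),
    ("GPT-4.1", "GLOBAL", 36.5),
]

def mean_by_type(rows):
    """Average the Avg Score column within each pool (GLOBAL / REGIONAL)."""
    by_type = {}
    for _model, mtype, score in rows:
        by_type.setdefault(mtype, []).append(score)
    return {mtype: round(mean(scores), 1) for mtype, scores in by_type.items()}

print(mean_by_type(rows))
```

Note the asymmetry this exposes: the regional entries cluster in a narrow high band, while the global pool's mean is dragged down by older GPT-4-era models, so a median or top-k comparison is arguably fairer than a pool mean when judging the crossover.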

View regional-only Korean LLMs


Recommended next step

If the mixed leaderboard shows a Korean-market model winning on your target rows, open its model page next and inspect the full score breakdown before choosing it over a global default.