Watch the AI race unfold
Scrub the timeline to travel through 18 months of model releases. See who held the crown, who climbed the ranks, and how the benchmark landscape evolved.
Monthly Race Snapshot
Export or embed the live race view for the currently selected month.
Claude Opus 4.7
Anthropic
Releases this month
14 models
Claude Mythos Preview
Anthropic
Claude Opus 4.7
Anthropic
GLM-5.1
Z.AI
Qwen3.6 Plus
Alibaba
Gemma 4 31B
Google
Qwen3.6-35B-A3B
Alibaba
Gemma 4 26B A4B
Google
Muse Spark
Meta
Ternary Bonsai 8B
Prism ML
Gemma 4 E4B
Google
Ternary Bonsai 1.7B
Prism ML
Gemma 4 E2B
Google
Ternary Bonsai 4B
Prism ML
LFM2.5-VL-450M
LiquidAI
Provider Race
Cumulative avg. top-3 score through Apr 2026
Benchmark Health
How fresh are the benchmarks we use to score models? Green means the benchmark is still actively separating models. Red means scores are bunching up near the top or the benchmark is outdated.
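The green/red distinction above can be sketched as a simple spread check: a benchmark is "separating" models when its top scores are far apart, and "saturating" when they bunch up. This is an illustrative sketch only; the function name, top-5 window, and 2-point threshold are assumptions, not the site's actual formula.

```python
# Hypothetical health check (assumed thresholds, not the site's real rule):
# a benchmark is "green" when the spread among its top scores is wide,
# and "red" when the leaders' scores bunch up.
def benchmark_health(scores: list[float], top_n: int = 5,
                     spread_threshold: float = 2.0) -> str:
    top = sorted(scores, reverse=True)[:top_n]
    spread = max(top) - min(top)  # gap between best and Nth-best score
    return "green" if spread >= spread_threshold else "red"

# A saturated benchmark (scores bunching up) vs. one still separating models.
print(benchmark_health([91.2, 91.0, 90.9, 90.8, 90.7]))  # red
print(benchmark_health([88.0, 84.5, 79.1, 72.3, 65.0]))  # green
```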
Agentic
29 benchmarks
Coding
15 benchmarks
Reasoning
14 benchmarks
Multimodal
41 benchmarks
Knowledge
18 benchmarks
Multilingual
8 benchmarks
Instruction Following
2 benchmarks
Math
17 benchmarks
Want deeper historical analysis?
Explore 21 months of Arena Elo ratings: crown changes, provider dominance, open-source gap tracking, and more.
LLM Leaderboard History