Unified Model Leaderboard
Benchmarks, pricing, runtime signals, and context window in one table. Filter state syncs to the URL, so every view is shareable. Provisional-ranked mode includes benchmark evidence that is non-generated but whose source has not been verified.
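As a rough illustration of how filter state could round-trip through a URL, here is a minimal Python sketch. The filter keys (`provider`, `type`) and the comma-joined encoding are assumptions made for this example; the page's actual query parameters are not documented here.

```python
from urllib.parse import urlencode, parse_qs


def filters_to_query(filters: dict[str, list[str]]) -> str:
    """Serialize filter state to a deterministic query string."""
    # Sort keys and values so the same filter state always yields the same URL.
    pairs = [
        (key, ",".join(sorted(values)))
        for key, values in sorted(filters.items())
        if values  # skip filters with nothing selected
    ]
    return urlencode(pairs)


def query_to_filters(query: str) -> dict[str, list[str]]:
    """Rebuild filter state from a query string produced by filters_to_query."""
    parsed = parse_qs(query)
    return {key: vals[0].split(",") for key, vals in parsed.items()}


# Hypothetical filter state: keys and values are illustrative only.
state = {"provider": ["OpenAI", "Anthropic"], "type": ["Reasoning"]}
query = filters_to_query(state)
restored = query_to_filters(query)
```

Sorting before encoding means two users who pick the same filters in a different order end up with the same shareable link. This sketch would break on filter values that themselves contain commas, so a real implementation would need a different separator or repeated keys.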
1 Claude Mythos Preview Anthropic | Anthropic | Closed | Current | Reasoning | 1M | $25.00 / $125.00 | N/A | N/A | 99 | 100 | 100 | — | 98 | 99 | 100 | 90 | — | — |
2 GPT-5.4 OpenAI | OpenAI | Closed | Current | Reasoning | 1.05M | $2.50 / $15.00 | 74 | 151.79s | 94 | 94 | 91 | 93 | 88 | 98 | 100 | 94 | 95 | 1465.79 |
3 Gemini 3.1 Pro Google | Google | Closed | Current | Standard | 1M | $1.25 / $5.00 | 109 | 29.71s | 94 | 88 | 94 | 97 | 90 | 96 | 100 | 93 | 71 | 1492.63 |
4 Claude Opus 4.6 Anthropic | Anthropic | Closed | Current | Standard | 1M | $15.00 / $75.00 | 40 | 1.78s | 92 | 93 | 91 | 90 | 84 | 92 | 100 | 95 | 89 | 1496.61 |
5 GPT-5.4 Pro OpenAI | OpenAI | Closed | Current | Reasoning | 1.05M | $30.00 / $180.00 | 74 | 151.79s | 92 | 92 | 93 | 99 | 100 | 60 | — | 94 | 100 | 1483.56 |
6 GPT-5.3 Codex OpenAI | OpenAI | Closed | Current | Reasoning | 400K | $2.50 / $10.00 | 79 | 88.26s | ~89 | 86 | 88 | 95 | 95 | 94 | 100 | 91 | 100 | 1416 |
7 Gemini 3 Pro Deep Think Google | Google | Closed | Current | Reasoning | 2M | N/A | N/A | N/A | ~87 | 88 | 77 | 89 | 100 | 89 | 85 | 83 | 96 | 1486.39 |
8 Claude Sonnet 4.6 Anthropic | Anthropic | Closed | Current | Standard | 200K | $3.00 / $15.00 | 44 | 1.48s | 86 | 85 | 83 | 83 | 95 | 85 | 91 | 82 | 78 | 1462.21 |
9 | Z.AI | Open | Current | Reasoning | 200K | $0.00 / $0.00 | N/A | N/A | ~85 | 86 | 76 | 88 | 73 | 84 | 82 | 81 | 93 | 1455.62 |
10 GLM-5.1 Z.AI | Z.AI | Open | Current | Reasoning | 203K | $1.40 / $4.40 | N/A | N/A | 84 | 83 | 83 | 65 | — | 85 | — | 93 | 89 | 1467.44 |
11 GPT-5.2 OpenAI | OpenAI | Closed | Current | Reasoning | 400K | $2.00 / $8.00 | 73 | 130.34s | ~84 | 66 | 84 | 86 | 86 | 93 | 99 | 86 | 84 | 1439.54 |
12 Gemini 3 Pro Google | Google | Closed | Current | Standard | 2M | N/A | 109 | 32.65s | ~83 | 76 | 75 | 82 | 86 | 84 | 82 | 79 | 84 | 1486.16 |
13 Grok 4.1 xAI | xAI | Closed | Superseded | Standard | 1M | $3.00 / $15.00 | N/A | N/A | ~81 | 73 | 69 | 92 | 98 | 95 | 100 | 86 | 92 | 1460.98 |
14 Qwen3.5 397B (Reasoning) Alibaba | Alibaba | Open | Current | Reasoning | 128K | $0.00 / $0.00 | N/A | N/A | ~81 | 77 | 85 | 82 | 59 | 80 | 86 | 82 | 92 | 1450 |
15 GPT-5.1 OpenAI | OpenAI | Closed | Current | Reasoning | 200K | $1.50 / $6.00 | 111 | 57.47s | ~81 | 81 | 81 | 68 | 96 | 84 | 86 | 78 | 70 | 1438.53 |
16 Claude Opus 4.5 Anthropic | Anthropic | Closed | Current | Standard | 200K | N/A | 46 | 1.01s | 80 | 81 | 79 | 70 | 72 | 84 | 84 | 58 | 95 | 1468 |
17 GPT-5 (high) OpenAI | OpenAI | Closed | Established | Reasoning | 128K | N/A | 83 | 36.28s | ~80 | 82 | 73 | 78 | 92 | 81 | 82 | 83 | 72 | 1433.37 |
18 GPT-5.2-Codex OpenAI | OpenAI | Closed | Current | Reasoning | 400K | $2.00 / $8.00 | 123 | 87.34s | ~80 | 84 | 81 | 89 | 89 | 80 | 88 | 93 | 98 | 1331 |
19 Kimi K2.5 (Reasoning) Moonshot AI | Moonshot AI | Closed | Current | Reasoning | 128K | N/A | N/A | N/A | ~79 | 69 | 87 | 70 | 71 | 75 | 90 | 100 | 68 | 1447 |
20 GPT-5.1-Codex-Max OpenAI | OpenAI | Closed | Current | Reasoning | 400K | $2.00 / $8.00 | N/A | N/A | ~79 | 81 | 79 | 90 | 90 | 81 | 86 | 89 | 97 | 1349 |
21 Grok 4.20 xAI | xAI | Closed | Current | Reasoning | 2M | $2.00 / $6.00 | 233 | 10.33s | 78 | 64 | 80 | 69 | 68 | — | — | 98 | — | 1490.38 |
22 GLM-5 Z.AI | Z.AI | Open | Superseded | Standard | 200K | $0.00 / $0.00 | 74 | 1.64s | 77 | 73 | 77 | 63 | 56 | 85 | 73 | 81 | 89 | 1455.57 |
23 Qwen3.6 Plus Alibaba | Alibaba | Closed | Current | Reasoning | 1M | $0.00 / $0.00 | N/A | N/A | 77 | 72 | 80 | 44 | 74 | 77 | 82 | 90 | — | — |
24 Gemma 4 31B Google | Google | Open | Current | Reasoning | 256K | $0.00 / $0.00 | N/A | N/A | ~74 | — | 87 | 55 | 71 | 75 | — | — | — | 1451.16 |
25 GPT-5 (medium) OpenAI | OpenAI | Closed | Established | Reasoning | 128K | N/A | 83 | 36.28s | ~74 | 75 | 81 | 75 | 89 | 76 | 87 | 78 | 92 | 1328 |
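For readers who want to work with the rows above programmatically, the following Python sketch splits one pipe-delimited row into cells and converts the numeric cells. It assumes that `—` and `N/A` mark missing values and that a leading `~` flags an approximate figure; neither convention is stated on the page. The helper names are hypothetical.

```python
from typing import Optional

# Assumption: these markers all mean "no value available".
MISSING = {"—", "N/A", ""}


def parse_row(line: str) -> list[str]:
    """Split a pipe-delimited row into stripped cells."""
    cells = [cell.strip() for cell in line.split("|")]
    # Rows end with a trailing "|", which leaves one empty cell behind.
    if cells and cells[-1] == "":
        cells.pop()
    return cells


def parse_number(cell: str) -> tuple[Optional[float], bool]:
    """Return (value, is_approximate); value is None when the cell is missing."""
    text = cell.strip()
    if text in MISSING:
        return None, False
    approximate = text.startswith("~")  # assumption: "~" marks an estimate
    return float(text.lstrip("~")), approximate


# Row copied verbatim from the table above.
row = ("6 GPT-5.3 Codex OpenAI | OpenAI | Closed | Current | Reasoning | 400K | "
       "$2.50 / $10.00 | 79 | 88.26s | ~89 | 86 | 88 | 95 | 95 | 94 | 100 | 91 | 100 | 1416 |")
cells = parse_row(row)
# In the rows above, the tenth cell onward holds plain numbers.
scores = [parse_number(cell) for cell in cells[9:]]
```

Keeping the approximate flag separate from the value lets downstream code sort on the number while still showing which figures are estimates.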