Showing 15 of 20 models
Benchmark groups: Knowledge (MMLU, GPQA, SuperGPQA, OpenBookQA); Coding (HumanEval); Math (AIME 2023 to 2025, HMMT Feb 2023 to 2025, BRUMO 2025); Reasoning (SimpleQA, MuSR).

Rank | Model | Creator | License | Score | MMLU (57 subjects) | GPQA (graduate-level) | SuperGPQA (285 disciplines) | OpenBookQA (open-book questions) | HumanEval (164 problems) | AIME 2023 (math competition) | AIME 2024 (math competition) | AIME 2025 (math competition) | HMMT Feb 2023 (Harvard-MIT tournament) | HMMT Feb 2024 (Harvard-MIT tournament) | HMMT Feb 2025 (Harvard-MIT tournament) | BRUMO 2025 (math olympiad) | SimpleQA (factuality) | MuSR (multi-step reasoning)
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
1 | GPT-5 (high) | OpenAI | Closed-source | 69 | 91% | 89% | 87% | 85% | 82% | 93% | 95% | 94% | 88% | 90% | 89% | 91% | 86% | 84%
2 | GPT-5 (medium) | OpenAI | Closed-source | 68 | 89% | 87% | 85% | 83% | 80% | 91% | 93% | 92% | 86% | 88% | 87% | 89% | 84% | 82%
3 | Grok 4 | xAI | Closed-source | 68 | 86% | 85% | 83% | 81% | 78% | 86% | 88% | 87% | 83% | 85% | 84% | 86% | 82% | 80%
4 | o3-pro | OpenAI | Closed-source | 68 | 87% | 88% | 86% | 84% | 79% | 89% | 91% | 90% | 85% | 87% | 86% | 88% | 85% | 83%
5 | o3 | OpenAI | Closed-source | 67 | 85% | 86% | 84% | 82% | 77% | 87% | 89% | 88% | 83% | 85% | 84% | 86% | 83% | 81%
6 | o4-mini (high) | OpenAI | Closed-source | 65 | 82% | 82% | 80% | 78% | 74% | 83% | 85% | 84% | 79% | 81% | 80% | 82% | 80% | 78%
7 | Gemini 2.5 Pro | Google | Closed-source | 65 | 83% | 83% | 81% | 79% | 75% | 84% | 86% | 85% | 80% | 82% | 81% | 83% | 81% | 79%
8 | GPT-5 mini | OpenAI | Closed-source | 64 | 79% | 79% | 77% | 75% | 71% | 80% | 82% | 81% | 76% | 78% | 77% | 79% | 77% | 75%
9 | Claude 4.1 Opus | Anthropic | Closed-source | 61 | 76% | 76% | 74% | 72% | 68% | 76% | 78% | 77% | 72% | 74% | 73% | 75% | 74% | 72%
10 | Claude 4 Sonnet | Anthropic | Closed-source | 59 | 73% | 73% | 71% | 69% | 65% | 73% | 75% | 74% | 69% | 71% | 70% | 72% | 71% | 69%
11 | Llama 3.1 405B | Meta | Open-source | 58 | 70% | 70% | 68% | 66% | 62% | 70% | 72% | 71% | 66% | 68% | 67% | 69% | 68% | 66%
12 | Mistral Large 2 | Mistral | Open-source | 57 | 68% | 68% | 66% | 64% | 60% | 68% | 70% | 69% | 64% | 66% | 65% | 67% | 66% | 64%
13 | GPT-4o | OpenAI | Closed-source | 56 | 66% | 66% | 64% | 62% | 58% | 66% | 68% | 67% | 62% | 64% | 63% | 65% | 64% | 62%
14 | Claude 3.5 Sonnet | Anthropic | Closed-source | 55 | 65% | 65% | 63% | 61% | 57% | 65% | 67% | 66% | 61% | 63% | 62% | 64% | 63% | 61%
15 | Gemini 1.5 Pro | Google | Closed-source | 54 | 64% | 64% | 62% | 60% | 56% | 64% | 66% | 65% | 60% | 62% | 61% | 63% | 62% | 60%
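
For readers who want to slice the table offline, here is a minimal Python sketch of filtering by creator or model name and sorting by a benchmark column. It is an illustration only, not the page's own implementation: the `ModelRow` class and the `filter_rows` and `sort_rows` helpers are hypothetical names, and only five rows and two benchmark columns (MMLU and HumanEval) are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    """One leaderboard row (hypothetical structure; subset of columns)."""
    rank: int
    model: str
    creator: str
    license: str
    mmlu: int       # percent, from the MMLU column
    humaneval: int  # percent, from the HumanEval column


# A few rows copied from the table above.
ROWS = [
    ModelRow(1, "GPT-5 (high)", "OpenAI", "Closed-source", 91, 82),
    ModelRow(3, "Grok 4", "xAI", "Closed-source", 86, 78),
    ModelRow(9, "Claude 4.1 Opus", "Anthropic", "Closed-source", 76, 68),
    ModelRow(11, "Llama 3.1 405B", "Meta", "Open-source", 70, 62),
    ModelRow(12, "Mistral Large 2", "Mistral", "Open-source", 68, 60),
]


def filter_rows(rows, creator=None, name_query=None):
    """Keep rows matching the creator (exact, case-insensitive) and/or
    whose model name contains name_query (case-insensitive)."""
    kept = []
    for row in rows:
        if creator is not None and row.creator.lower() != creator.lower():
            continue
        if name_query is not None and name_query.lower() not in row.model.lower():
            continue
        kept.append(row)
    return kept


def sort_rows(rows, column, descending=True):
    """Return a new list sorted by the named attribute, e.g. 'humaneval'."""
    return sorted(rows, key=lambda row: getattr(row, column), reverse=descending)


if __name__ == "__main__":
    open_models = [r for r in ROWS if r.license == "Open-source"]
    for row in sort_rows(open_models, "mmlu"):
        print(f"{row.rank:>2}  {row.model:<18} MMLU {row.mmlu}%")
```

Sorting returns a new list rather than mutating the input, so the original rank order stays available for comparison.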