Knowledge Benchmarks

General knowledge and factual understanding - compare AI models across four benchmarks: MMLU, GPQA, SuperGPQA, and OpenBookQA.

Filters & Search

Filter models by creator or search by name to narrow the results.
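
As a rough illustration of the filter and search behaviour, here is a minimal sketch that treats a few leaderboard rows as plain records. The field names (`model`, `creator`) and the `filter_models` helper are illustrative assumptions, not the site's actual schema.

```python
# Hypothetical sketch of the filter/search behaviour described above.
# Field names ("model", "creator") are illustrative, not the site's schema.
rows = [
    {"model": "GPT-5 (high)", "creator": "OpenAI"},
    {"model": "Claude 4 Sonnet", "creator": "Anthropic"},
    {"model": "Llama 3.1 405B", "creator": "Meta"},
]

def filter_models(rows, creator=None, query=None):
    """Keep rows matching an exact creator and/or a case-insensitive name search."""
    if creator is not None:
        rows = [r for r in rows if r["creator"] == creator]
    if query is not None:
        rows = [r for r in rows if query.lower() in r["model"].lower()]
    return rows

print(filter_models(rows, creator="Anthropic"))  # filter by creator
print(filter_models(rows, query="llama"))        # search by name
```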

Knowledge Benchmark Results

Showing 15 of 20 models • Click column headers to sort

| Rank | Model | Creator | License | Score | MMLU | GPQA | SuperGPQA | OpenBookQA |
|------|-------|---------|---------|-------|------|------|-----------|------------|
| 1 | GPT-5 (high) | OpenAI | Closed-source | 69 | 91% | 89% | 87% | 85% |
| 2 | GPT-5 (medium) | OpenAI | Closed-source | 68 | 89% | 87% | 85% | 83% |
| 3 | Grok 4 | xAI | Closed-source | 68 | 86% | 85% | 83% | 81% |
| 4 | o3-pro | OpenAI | Closed-source | 68 | 87% | 88% | 86% | 84% |
| 5 | o3 | OpenAI | Closed-source | 67 | 85% | 86% | 84% | 82% |
| 6 | o4-mini (high) | OpenAI | Closed-source | 65 | 82% | 82% | 80% | 78% |
| 7 | Gemini 2.5 Pro | Google | Closed-source | 65 | 83% | 83% | 81% | 79% |
| 8 | GPT-5 mini | OpenAI | Closed-source | 64 | 79% | 79% | 77% | 75% |
| 9 | Claude 4.1 Opus | Anthropic | Closed-source | 61 | 76% | 76% | 74% | 72% |
| 10 | Claude 4 Sonnet | Anthropic | Closed-source | 59 | 73% | 73% | 71% | 69% |
| 11 | Llama 3.1 405B | Meta | Open-source | 58 | 70% | 70% | 68% | 66% |
| 12 | Mistral Large 2 | Mistral | Open-source | 57 | 68% | 68% | 66% | 64% |
| 13 | GPT-4o | OpenAI | Closed-source | 56 | 66% | 66% | 64% | 62% |
| 14 | Claude 3.5 Sonnet | Anthropic | Closed-source | 55 | 65% | 65% | 63% | 61% |
| 15 | Gemini 1.5 Pro | Google | Closed-source | 54 | 64% | 64% | 62% | 60% |

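To mirror the "click column headers to sort" behaviour in code, the sketch below orders a few rows from the table above by a chosen benchmark column. The key names (`mmlu`, `gpqa`, and so on) are illustrative assumptions.

```python
# Minimal sketch of sorting the leaderboard by a chosen benchmark column.
# Scores are taken from the table above; key names are illustrative.
rows = [
    {"model": "GPT-5 (high)",   "mmlu": 91, "gpqa": 89, "supergpqa": 87, "openbookqa": 85},
    {"model": "o3-pro",         "mmlu": 87, "gpqa": 88, "supergpqa": 86, "openbookqa": 84},
    {"model": "Gemini 2.5 Pro", "mmlu": 83, "gpqa": 83, "supergpqa": 81, "openbookqa": 79},
    {"model": "Llama 3.1 405B", "mmlu": 70, "gpqa": 70, "supergpqa": 68, "openbookqa": 66},
]

def sort_by(rows, column):
    """Return rows ordered by the chosen benchmark column, highest score first."""
    return sorted(rows, key=lambda r: r[column], reverse=True)

for r in sort_by(rows, "gpqa"):
    print(f'{r["model"]:<16} GPQA {r["gpqa"]}%')
```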

About Knowledge Benchmarks

MMLU

Tests knowledge across 57 academic subjects

GPQA

Expert-level questions in biology, physics, and chemistry

SuperGPQA

An expanded version of GPQA covering 285 graduate-level disciplines

OpenBookQA

Multi-step reasoning with scientific facts