Claude 4 Sonnet vs GPT-5.2-Codex

Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.

Quick Verdict

GPT-5.2-Codex wins overall with a score of 82 vs 59 (a 23-point difference). It also wins 4 out of 4 categories.

Knowledge

Category winner: GPT-5.2-Codex

Benchmark      Claude 4 Sonnet   GPT-5.2-Codex
Average        71.5              96
MMLU           73                99
GPQA           73                97
SuperGPQA      71                95
OpenBookQA     69                93

Coding

Category winner: GPT-5.2-Codex

Benchmark      Claude 4 Sonnet   GPT-5.2-Codex
Average        65                95
HumanEval      65                95

Mathematics

Category winner: GPT-5.2-Codex

Benchmark        Claude 4 Sonnet   GPT-5.2-Codex
Average          72                97.1
AIME 2023        73                99
AIME 2024        75                99
AIME 2025        74                98
HMMT Feb 2023    69                95
HMMT Feb 2024    71                97
HMMT Feb 2025    70                96
BRUMO 2025       72                96

Reasoning

Category winner: GPT-5.2-Codex

Benchmark      Claude 4 Sonnet   GPT-5.2-Codex
Average        70                94
SimpleQA       71                95
MuSR           69                93
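
The category averages quoted on this page are consistent with a plain arithmetic mean of the per-benchmark scores, rounded to one decimal place. The Python sketch below recomputes them from the numbers listed above. The names SCORES and category_averages are made up for illustration, and the unweighted mean is an assumption, since the page does not state its method. The overall 82 vs 59 score is not recomputed here because the page does not say how it is derived.

```python
from statistics import mean

# Per-benchmark scores copied from the category sections above,
# stored as (Claude 4 Sonnet, GPT-5.2-Codex).
SCORES = {
    "Knowledge": {
        "MMLU": (73, 99),
        "GPQA": (73, 97),
        "SuperGPQA": (71, 95),
        "OpenBookQA": (69, 93),
    },
    "Coding": {
        "HumanEval": (65, 95),
    },
    "Mathematics": {
        "AIME 2023": (73, 99),
        "AIME 2024": (75, 99),
        "AIME 2025": (74, 98),
        "HMMT Feb 2023": (69, 95),
        "HMMT Feb 2024": (71, 97),
        "HMMT Feb 2025": (70, 96),
        "BRUMO 2025": (72, 96),
    },
    "Reasoning": {
        "SimpleQA": (71, 95),
        "MuSR": (69, 93),
    },
}


def category_averages(scores):
    """Return {category: (claude_avg, gpt_avg)} using an unweighted mean.

    Assumption: each benchmark counts equally within its category.
    """
    result = {}
    for category, benchmarks in scores.items():
        claude_avg = mean(pair[0] for pair in benchmarks.values())
        gpt_avg = mean(pair[1] for pair in benchmarks.values())
        result[category] = (round(claude_avg, 1), round(gpt_avg, 1))
    return result


averages = category_averages(SCORES)

# Count the categories in which GPT-5.2-Codex has the higher average.
gpt_wins = sum(1 for claude_avg, gpt_avg in averages.values() if gpt_avg > claude_avg)

for category, (claude_avg, gpt_avg) in averages.items():
    print(f"{category}: Claude 4 Sonnet {claude_avg:g} vs GPT-5.2-Codex {gpt_avg:g}")
print(f"GPT-5.2-Codex leads in {gpt_wins} of {len(averages)} categories")
```

Under that assumption the recomputed averages match the ones shown in each category, which suggests the Mathematics figure of 97.1 is a rounded value (680 / 7).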

Frequently Asked Questions

Which is better, Claude 4 Sonnet or GPT-5.2-Codex?

GPT-5.2-Codex scores higher overall with 82 vs 59, a difference of 23 points across all benchmarks.

Which is better for knowledge tasks, Claude 4 Sonnet or GPT-5.2-Codex?

GPT-5.2-Codex leads in knowledge tasks with an average score of 96 vs 71.5.

Which is better for coding, Claude 4 Sonnet or GPT-5.2-Codex?

GPT-5.2-Codex leads in coding with an average score of 95 vs 65.

Which is better for math, Claude 4 Sonnet or GPT-5.2-Codex?

GPT-5.2-Codex leads in math with an average score of 97.1 vs 72.

Which is better for reasoning, Claude 4 Sonnet or GPT-5.2-Codex?

GPT-5.2-Codex leads in reasoning with an average score of 94 vs 70.