Head-to-head comparison across 8 benchmark categories
GPT-5.4 Pro: 92
Claude Opus 4.6: 85
Pick GPT-5.4 Pro if you want the stronger benchmark profile. Claude Opus 4.6 is the better choice only if you want a lower token bill or want to avoid the extra latency and token usage of a reasoning model.
Agentic: +15.1 difference
Coding: +15.2 difference
Reasoning: +13.3 difference
Knowledge: +7.1 difference
Math: +1.0 difference
Multilingual: +1.0 difference
Multimodal: +10.1 difference
Inst. Following: +2.0 difference
GPT-5.4 Pro vs. Claude Opus 4.6
Price: $30 / $180 vs. $15 / $75
Speed: 74 t/s vs. 40 t/s
Latency: 151.79s vs. 1.78s
Context: 1.05M vs. 1M
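To make the "cheaper token bill" trade-off concrete, here is a minimal Python sketch that estimates per-request cost from the $30 / $180 and $15 / $75 figures above. It assumes those figures are USD per 1M tokens, listed as input / output; the page does not state the unit, so treat that as an assumption. The function name request_cost and the example workload are hypothetical.

```python
# Assumption: prices are USD per 1M tokens, listed as (input, output).
# The source page does not state the unit explicitly.
PRICES = {
    "GPT-5.4 Pro": (30.0, 180.0),
    "Claude Opus 4.6": (15.0, 75.0),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request for the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
    for model in PRICES:
        cost = request_cost(model, 10_000, 2_000)
        print(f"{model}: ${cost:.2f}")
```

Under that assumption, the hypothetical workload costs about $0.66 on GPT-5.4 Pro and $0.30 on Claude Opus 4.6. The sketch treats both models as producing the same number of output tokens; if a reasoning model emits extra tokens, as the verdict above suggests, the real gap would be wider.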