All OpenAI models ranked by benchmark performance — GPT-5, GPT-4o, o1, o3, and more.
Unless noted otherwise, rankings on this page use BenchLM's provisional leaderboard lane rather than the stricter, sourced-only verified leaderboard.
Bottom line: GPT-5.4 is OpenAI's strongest model. It leads on knowledge and holds a top-3 spot overall. GPT-5.4 Pro adds premium reasoning at a higher cost. GPT-5.3 Codex is the coding specialist.
According to BenchLM.ai, GPT-5.4 leads this ranking with a score of 93, followed by GPT-5.4 Pro (92) and GPT-5.3 Codex (89). The top three span four points (93 to 89), which suggests genuine but modest performance differences.
The best open-weight option is GPT-OSS 120B (ranked #21 with a score of 38). Proprietary models hold a clear advantage in this category, though open-weight options may suffice for less demanding use cases.
This ranking is based on provisional overall weighted scores computed with BenchLM.ai's scoring formula. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.
GPT-5.4
OpenAI · 1.05M
GPT-5.4 Pro
OpenAI · 1.05M
GPT-5.3 Codex
OpenAI · 400K
GPT-5.4 leads OpenAI's lineup with the highest overall score and best knowledge (98).
GPT-5.4 Pro is the premium tier, with perfect multimodal and math scores.
GPT-5.3 Codex is coding-focused, with perfect math and multilingual scores.
The top model is GPT-5.4 by OpenAI with a provisional score of 93.
The best open-weight model is GPT-OSS 120B at position #21.
25 models are included in this ranking.
Models are ranked by the same overall BenchLM score used across all leaderboards. Comparing within OpenAI's lineup helps identify which model fits your use case and budget.
This page only shows OpenAI models. Cross-provider comparison requires the overall or category-specific leaderboards. Newer models may have limited benchmark coverage initially.