All OpenAI models ranked by benchmark performance — GPT-5, GPT-4o, o1, o3, and more.
Unless noted otherwise, ranking surfaces on this page use BenchLM's provisional leaderboard lane rather than the stricter sourced-only verified leaderboard.
Bottom line: GPT-5.5 tops OpenAI's lineup on overall score. GPT-5.4 leads on knowledge and holds a top-3 spot overall. GPT-5.4 Pro adds premium reasoning at higher cost. GPT-5.3 Codex is the coding specialist.
According to BenchLM.ai, GPT-5.5 leads this ranking with a score of 91, followed by GPT-5.4 Pro (91) and GPT-5.4 (89). The top three are separated by just 2 points, so any of them is a strong choice.
The best open-weight option is GPT-OSS 120B (ranked #21 with a score of 34). Proprietary models hold a clear advantage in this category, though open-weight options may suffice for less demanding use cases.
This ranking is based on provisional overall weighted scores computed with BenchLM.ai's scoring formula. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.
GPT-5.5 · OpenAI · 1M
GPT-5.4 Pro · OpenAI · 1.05M
GPT-5.4 · OpenAI · 1.05M
GPT-5.4: third overall in OpenAI's lineup (89), with the best knowledge score (98).
GPT-5.4 Pro: premium tier with perfect multimodal and math scores.
GPT-5.3 Codex: coding-focused, with perfect math and multilingual scores.
The top model is GPT-5.5 by OpenAI with a provisional score of 91.
The best open-weight model is GPT-OSS 120B at position #21.
25 models are included in this ranking.
Models are ranked by the same overall BenchLM score used across all leaderboards. Comparing within OpenAI's lineup helps identify which model fits your use case and budget.
This page only shows OpenAI models. Cross-provider comparison requires the overall or category-specific leaderboards. Newer models may have limited benchmark coverage initially.
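As a rough illustration of how the figures quoted on this page relate to each other, the sketch below sorts the four models whose scores appear in the text and derives the top-three spread and the gap to the best open-weight model. It is not BenchLM.ai's scoring formula, which this page does not spell out. The `Entry` type and `rank` helper are hypothetical names, the other 21 models are left out because their scores are not given here, and breaking the 91/91 tie by input order is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    score: int         # provisional overall score as quoted on this page
    open_weight: bool  # True for open-weight models


# Only the four scores quoted in the text; the other 21 models are omitted.
ENTRIES = [
    Entry("GPT-5.4", 89, False),
    Entry("GPT-OSS 120B", 34, True),
    Entry("GPT-5.5", 91, False),
    Entry("GPT-5.4 Pro", 91, False),
]


def rank(entries):
    """Sort by score, highest first.

    sorted() is stable, so tied entries keep their input order.
    That tie-break is an assumption, not BenchLM.ai's documented rule.
    """
    return sorted(entries, key=lambda e: e.score, reverse=True)


ranked = rank(ENTRIES)
top_three = ranked[:3]

# Point spread between first and third place.
spread = top_three[0].score - top_three[-1].score

# Highest-scoring open-weight model and its gap to the leader.
best_open = next(e for e in ranked if e.open_weight)
gap = ranked[0].score - best_open.score

for position, entry in enumerate(ranked, start=1):
    # Positions here count only the four listed models, not the full ranking.
    print(position, entry.name, entry.score)
print("top-three spread:", spread)
print("gap to best open-weight model:", gap)
```

The positions printed here cover only these four entries, so GPT-OSS 120B shows up fourth rather than at its actual #21 position in the full 25-model ranking.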