Name: GPT-5.4
Rating: 93 (22 reviews)
Author: OpenAI

Question 1

How does GPT-5.4 perform overall in AI benchmarks?

Accepted Answer

GPT-5.4 currently ranks #3 out of 110 models on BenchLM's provisional leaderboard, with an overall score of 93, and #4 out of 14 on the verified leaderboard. It was created by OpenAI and has a 1.05M-token context window.

Question 2

Is GPT-5.4 good for knowledge and understanding?

Accepted Answer

GPT-5.4 ranks #2 out of 110 models on knowledge and understanding benchmarks, with an average score of 96.9. It is among the top performers in this category.

Question 3

Is GPT-5.4 good for coding and programming?

Accepted Answer

GPT-5.4 ranks #4 out of 110 models on coding and programming benchmarks, with an average score of 90.8. It is among the top performers in this category.

Question 4

Is GPT-5.4 good for agentic tool use and computer tasks?

Accepted Answer

GPT-5.4 ranks #3 out of 110 models on benchmarks for agentic tool use and computer tasks, with an average score of 92.2. It is among the top performers in this category.

Question 5

Is GPT-5.4 good for multimodal and grounded tasks?

Accepted Answer

GPT-5.4 ranks #15 out of 110 models on benchmarks for multimodal and grounded tasks, with an average score of 87.9. This is its weakest category among those listed here, and stronger options exist.

Question 6

Which sibling models are related to GPT-5.4?

Accepted Answer

GPT-5.4 belongs to the GPT-5.4 family. Related variants on BenchLM include GPT-5.4 Pro, GPT-5.4 mini, and GPT-5.4 nano.

Question 7

Does GPT-5.4 have full benchmark coverage on BenchLM?

Accepted Answer

Not yet. GPT-5.4 currently has published scores for 22 of the 152 benchmarks BenchLM tracks. BenchLM exposes only public, non-generated benchmark rows, so missing categories stay blank until a sourced evaluation is available.
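
To put those two counts in perspective, the coverage can be expressed as a percentage. The snippet below is a minimal sketch of that arithmetic; the function name `coverage` is hypothetical and is not part of any BenchLM API.

```python
def coverage(published: int, tracked: int) -> float:
    """Return the fraction of tracked benchmarks that have a published score."""
    if tracked <= 0:
        raise ValueError("tracked must be positive")
    return published / tracked


# Counts quoted in the answer above: 22 published scores, 152 tracked benchmarks.
gpt54_coverage = coverage(22, 152)
print(f"{gpt54_coverage:.1%}")  # prints 14.5%
```

In other words, roughly one in seven tracked benchmarks currently has a published score for this model.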

Question 8

What is the context window size of GPT-5.4?

Accepted Answer

GPT-5.4 has a context window of 1.05M tokens, which determines how much text it can process in a single interaction.

[Charts not reproduced: Ranking Distribution, Category Performance]

Category Breakdown: Agentic, Coding, Reasoning, Knowledge, Math, Multilingual, Multimodal, Instruction Following
