GPT-5.4 mini Benchmark Scores & Performance

BenchLM is tracking GPT-5.4 mini by OpenAI, but this profile is currently excluded from the trusted leaderboard because its source-backed benchmark coverage is not yet complete enough for ranking. The model metadata and any verified benchmark rows stay visible while the rest of the public eval record is re-checked.

GPT-5.4 mini is a proprietary model with a 400K-token context window. It uses explicit chain-of-thought reasoning, which typically improves performance on math and complex reasoning tasks at the cost of higher latency and token usage.

GPT-5.4 mini sits inside the GPT-5.4 family alongside GPT-5.4, GPT-5.4 Pro, and GPT-5.4 nano. BenchLM links it directly to GPT-5 mini as the earlier related model in that lineage. This profile currently has 1 trusted benchmark row on BenchLM, which is not enough for a leaderboard rank yet.

Creator: OpenAI
Source Type: Proprietary
Reasoning: Reasoning model
Context Window: 400K
Overall Score: Not ranked yet

Family & Lineage

Family: GPT-5.4 (Mini)
Canonical Entry: GPT-5.4
Related Earlier Model: GPT-5 mini

Rankings Overview

BenchLM has not yet verified enough trusted benchmark coverage to place this model on the leaderboard. Category ranks will appear here once that source-backed coverage is complete.

Knowledge Benchmarks

GPQA: 88%
HLE: 41.5%
HLE w/o tools: 28.2%

Coding Benchmarks

SWE-bench Pro: 54.4%

Reasoning Benchmarks

MRCR v2: 40.7%
MRCR v2 64K-128K: 47.7%
MRCR v2 128K-256K: 33.6%
Graphwalks BFS 128K: 76.3%
Graphwalks Parents 128K: 71.5%

Agentic Benchmarks

Terminal-Bench 2.0: 60%
OSWorld-Verified: 72.1%
MCP Atlas: 57.7%
Toolathlon: 42.9%
tau2-bench: 93.4%

Multimodal & Grounded Benchmarks

MMMU-Pro: 76.6%
MMMU-Pro w/ Python: 78%
OmniDocBench 1.5: 0.1263

Frequently Asked Questions

How does GPT-5.4 mini perform overall in AI benchmarks?

BenchLM is tracking GPT-5.4 mini, but its trusted, source-backed benchmark coverage is not yet complete, so the model has no overall score or rank. Its creator, model type, and context window are listed, along with any verified benchmark rows, while more public benchmark results are verified.

Which sibling models are related to GPT-5.4 mini?

GPT-5.4 mini belongs to the GPT-5.4 family. Related variants on BenchLM include GPT-5.4, GPT-5.4 Pro, and GPT-5.4 nano.

Does GPT-5.4 mini have full benchmark coverage on BenchLM?

Not yet. GPT-5.4 mini is tracked on BenchLM, but its current source-backed benchmark coverage is not complete enough for a trusted leaderboard rank. We keep the model page live while we verify more public benchmark results.

What is the context window size of GPT-5.4 mini?

GPT-5.4 mini has a context window of 400K tokens, which determines how much text (input plus generated output) it can process in a single interaction.
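As a rough illustration of what a 400K-token window means in practice, the sketch below estimates whether a piece of text would fit. It is not an official tool: the exact limit of 400,000 tokens, the 4-characters-per-token heuristic, and the names `estimate_tokens` and `fits_in_context` are all assumptions made for this example. A real tokenizer would give different counts, especially for code or non-English text.

```python
# Assumption: "400K" is read as exactly 400,000 tokens.
CONTEXT_WINDOW_TOKENS = 400_000

# Assumption: ~4 characters per token, a common rule of thumb for English
# prose. This is a heuristic, not the model's actual tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate, rounding up to a whole token."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, reserved_output_tokens: int = 0) -> bool:
    """Check whether the estimated prompt size, plus any tokens set aside
    for the model's reply, stays within the assumed context window."""
    needed = estimate_tokens(text) + reserved_output_tokens
    return needed <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    prompt = "word " * 100_000  # 500,000 characters
    print(estimate_tokens(prompt))
    print(fits_in_context(prompt, reserved_output_tokens=8_000))
```

Reserving part of the budget for output matters because reasoning models can spend a large share of the window on intermediate reasoning tokens before producing an answer.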

Last updated: March 17, 2026
