GPT-4.1 nano Benchmark Scores & Performance

Benchmark analysis of GPT-4.1 nano by OpenAI across 4 tests.

According to BenchLM.ai, GPT-4.1 nano ranks #100 out of 100 models with an overall score of 23/100. It is not a frontier model; its appeal lies in the 1M token context window and the faster, lower-token responses of a non-reasoning model.

GPT-4.1 nano is a proprietary model with a 1M token context window. It processes queries without explicit chain-of-thought reasoning, offering faster response times and lower token usage.

GPT-4.1 nano sits inside the GPT-4.1 family alongside GPT-4.1 and GPT-4.1 mini. This profile currently has scores for 4 of the 22 tracked benchmarks, so the overall score is conservative until the rest of the suite is filled in.

Its strongest category is Instruction Following (#45), while its weakest is Coding (#100). The profile is uneven: mid-table on instruction following and knowledge, and near the bottom on mathematics and coding.

Creator

OpenAI

Source Type

Proprietary

Reasoning

Non-Reasoning

Context Window

1M

Overall Score

23 (#100 of 100)

Family & Lineage

Family

GPT-4.1

Nano

Canonical Entry

GPT-4.1

Knowledge Benchmarks

MMLU: 80.1
GPQA: 50.3

Mathematics Benchmarks

AIME 2024: 9.8

Instruction Following Benchmarks

IFEval: 83.2

Frequently Asked Questions

How does GPT-4.1 nano perform overall in AI benchmarks?

GPT-4.1 nano ranks #100 out of 100 models with an overall score of 23. It was created by OpenAI and has a 1M token context window.

Is GPT-4.1 nano good for knowledge and understanding?

GPT-4.1 nano ranks #49 out of 100 models in knowledge and understanding benchmarks with an average score of 65.2. There are stronger options in this category.
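
The 65.2 average quoted here is consistent with a simple unweighted mean of the two knowledge scores on this page (MMLU 80.1 and GPQA 50.3). The sketch below reproduces the category averages under that assumption; BenchLM does not state its formula on this page, so the `category_average` helper and the rounding to one decimal are illustrative guesses, not the site's documented method.

```python
from statistics import mean

# Scores as listed on this page, grouped by category.
scores = {
    "knowledge": {"MMLU": 80.1, "GPQA": 50.3},
    "mathematics": {"AIME 2024": 9.8},
    "instruction_following": {"IFEval": 83.2},
}


def category_average(benchmarks: dict[str, float]) -> float:
    """Unweighted mean of a category's scores, rounded to one decimal.

    Assumption: this is how the per-category average is derived.
    """
    return round(mean(benchmarks.values()), 1)


averages = {name: category_average(b) for name, b in scores.items()}
print(averages)
```

With only one benchmark in the mathematics and instruction following categories, the average in each case is just that single score, which matches the 9.8 and 83.2 figures reported in the answers that follow.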

Is GPT-4.1 nano good for mathematics?

GPT-4.1 nano ranks #96 out of 100 models in mathematics benchmarks with an average score of 9.8. There are stronger options in this category.

Is GPT-4.1 nano good for instruction following?

GPT-4.1 nano ranks #45 out of 100 models in instruction following benchmarks with an average score of 83.2. There are stronger options in this category.

Which sibling models are related to GPT-4.1 nano?

GPT-4.1 nano belongs to the GPT-4.1 family. Related variants on BenchLM include GPT-4.1 and GPT-4.1 mini.

Does GPT-4.1 nano have full benchmark coverage on BenchLM?

Not yet. GPT-4.1 nano currently has 4 sourced benchmark scores out of the 22 benchmarks BenchLM tracks, so its overall score is intentionally conservative until more results are added.
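
To put the coverage figure in proportion, the short calculation below expresses 4 of 22 as a percentage. The variable names are illustrative; only the two counts come from this page.

```python
scored = 4    # benchmarks with sourced scores, per this page
tracked = 22  # benchmarks BenchLM tracks, per this page

coverage = scored / tracked
missing = tracked - scored
print(f"coverage: {coverage:.1%}, missing: {missing}")
# prints "coverage: 18.2%, missing: 18"
```

Fewer than one in five tracked benchmarks currently contributes to the profile, which is why the overall score should be read as provisional.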

What is the context window size of GPT-4.1 nano?

GPT-4.1 nano has a context window of 1M tokens, which determines how much text it can process in a single interaction.

Last updated: March 9, 2026
