Gemma 4 26B A4B Benchmark Scores & Performance

Benchmark analysis of Gemma 4 26B A4B by Google across 8 sourced tests on BenchLM.

According to BenchLM.ai, Gemma 4 26B A4B ranks #43 out of 103 models with an overall score of 64/100. While not a frontier model, it offers specific advantages depending on the use case.

Gemma 4 26B A4B is an open-weight model with a 256K-token context window. It uses explicit chain-of-thought reasoning, which typically improves performance on math and complex reasoning tasks at the cost of higher latency and token usage.

Gemma 4 26B A4B sits in the Gemma 4 family alongside Gemma 4 31B, Gemma 4 E4B, and Gemma 4 E2B. This profile currently covers 8 of the 125 benchmarks BenchLM tracks. BenchLM only exposes verified benchmark rows publicly, so missing categories stay blank until a sourced evaluation is available.

Its best-ranked category is Multimodal & Grounded (#42 of 103), while its lowest-ranked is Knowledge (#49 of 103). Relative to the rest of its profile, it is best suited to screenshots, documents, charts, and other grounded multimodal workflows.

Provider: Google
Source Type: Open Weight
Reasoning: Reasoning
Context Window: 256K
Model Status: Current
Release Date: Apr 2, 2026
Overall Score: 64 (#43 of 103)
Pricing: $0.00 / $0.00 (input / output per 1M)
Runtime: N/A (latency unavailable)
Arena Elo: 1440.64 (Text Overall)

Arena Text Profile

Human-preference results from LM Arena text leaderboards. These are displayed separately from BenchLM benchmark scoring.

Text Overall: 1440.64 (±8.59 · 4,548 votes)
Coding: 1487.94 (±17.55 · 1,044 votes)
Math: 1469.98 (±33.66 · 267 votes)
Instruction Following: 1446.18 (±16.2 · 1,204 votes)
Creative Writing: 1404.8 (±21.49 · 750 votes)
Multi-turn: 1456.52 (±20.07 · 811 votes)
Hard Prompts: 1463.97 (±11.54 · 2,478 votes)
Hard Prompts (English): 1473.34 (±16.96 · 1,136 votes)
Longer Query: 1455.9 (±16.45 · 1,222 votes)
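
The ratings and ± margins above can be read as bands, and a rating gap on a single leaderboard can be turned into an expected head-to-head preference rate. The following is a minimal Python sketch, assuming the conventional Elo logistic curve (base 10, scale 400). Whether LM Arena's published ratings follow exactly this scale is an assumption here, the function names are illustrative, and the opponent rating of 1400 in the usage lines is hypothetical.

```python
# Ratings and +/- margins copied from the Arena Text Profile above.
ARENA = {
    "Text Overall": (1440.64, 8.59),
    "Coding": (1487.94, 17.55),
    "Math": (1469.98, 33.66),
    "Instruction Following": (1446.18, 16.2),
    "Creative Writing": (1404.8, 21.49),
    "Multi-turn": (1456.52, 20.07),
    "Hard Prompts": (1463.97, 11.54),
    "Hard Prompts (English)": (1473.34, 16.96),
    "Longer Query": (1455.9, 16.45),
}


def interval(category):
    """Return the (low, high) band implied by a rating and its margin."""
    rating, margin = ARENA[category]
    return rating - margin, rating + margin


def intervals_overlap(cat_a, cat_b):
    """True when the two categories' bands overlap."""
    low_a, high_a = interval(cat_a)
    low_b, high_b = interval(cat_b)
    return low_a <= high_b and low_b <= high_a


def expected_win_rate(rating, opponent_rating):
    """Conventional Elo expectation (assumed base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


if __name__ == "__main__":
    print(interval("Math"))
    print(intervals_overlap("Coding", "Creative Writing"))
    # Hypothetical opponent rated 1400 on the same leaderboard.
    print(round(expected_win_rate(ARENA["Text Overall"][0], 1400.0), 3))
```

Under these assumptions, the Math band (±33.66 from 267 votes) is wide enough to overlap every other category except Creative Writing, so the ordering of categories by rating should be read loosely.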

Family & Lineage

Family: Gemma 4 (26b-a4b)
Canonical Entry: Gemma 4 31B

Knowledge Benchmarks

GPQA (Refreshing): 82.3%
GPQA Diamond · Static refresh · updated April 2, 2026

MMLU-Pro (Refreshing): 82.6%
MMLU-Pro · Static refresh · updated April 2, 2026

HLE (Current): 17.2%
Humanity's Last Exam · Static refresh · updated April 2, 2026

HLE w/o tools (Current, display only): 8.7%
HLE w/o tools 2026 · Quarterly refresh · updated April 2, 2026

Coding Benchmarks

LiveCodeBench (Current): 77.1%
Rolling 2026 set · Rolling refresh · updated April 2, 2026

Reasoning Benchmarks

BBH (Stale, saturated, display only): 64.8%
BBH 2022 · Static refresh · updated April 2, 2026

MRCRv2 (Current): 44.1%
MRCRv2 2025 · Quarterly refresh · updated April 2, 2026

Multimodal & Grounded Benchmarks

MMMU-Pro (Refreshing): 73.8%
MMMU-Pro 2024 · Annual refresh · updated April 2, 2026
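
As a quick arithmetic check on the coverage figure, the benchmark rows listed above can be tallied by category and compared with the 125 benchmarks BenchLM tracks. This is a minimal Python sketch: the dictionary layout and the names `SCORES`, `TRACKED_BENCHMARKS`, and `coverage` are illustrative, and the scores are copied from the rows above.

```python
# Scores (in percent) copied from the benchmark rows above, grouped by category.
SCORES = {
    "Knowledge": {"GPQA": 82.3, "MMLU-Pro": 82.6, "HLE": 17.2, "HLE w/o tools": 8.7},
    "Coding": {"LiveCodeBench": 77.1},
    "Reasoning": {"BBH": 64.8, "MRCRv2": 44.1},
    "Multimodal & Grounded": {"MMMU-Pro": 73.8},
}
TRACKED_BENCHMARKS = 125  # total benchmarks BenchLM tracks, per this page


def coverage(scores, tracked):
    """Return (number of listed rows, share of tracked benchmarks)."""
    listed = sum(len(rows) for rows in scores.values())
    return listed, listed / tracked


if __name__ == "__main__":
    listed, share = coverage(SCORES, TRACKED_BENCHMARKS)
    print(f"{listed} of {TRACKED_BENCHMARKS} benchmarks ({share:.1%})")
    for category, rows in SCORES.items():
        best = max(rows, key=rows.get)
        print(f"{category}: {len(rows)} row(s), highest is {best} at {rows[best]}%")
```

The sketch deliberately does not recompute category averages, because the page does not state how BenchLM weights individual rows or how display-only rows are treated.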

Frequently Asked Questions

How does Gemma 4 26B A4B perform overall in AI benchmarks?

Gemma 4 26B A4B ranks #43 out of 103 models with an overall score of 64/100. It was created by Google and has a 256K-token context window.

Is Gemma 4 26B A4B good for knowledge and understanding?

Gemma 4 26B A4B ranks #49 out of 103 models in knowledge and understanding benchmarks with an average score of 56.1. There are stronger options in this category.

Is Gemma 4 26B A4B good for coding and programming?

Gemma 4 26B A4B has visible benchmark coverage in coding and programming, but BenchLM does not currently assign it a global category rank there.

Is Gemma 4 26B A4B good for reasoning and logic?

Gemma 4 26B A4B has visible benchmark coverage in reasoning and logic, but BenchLM does not currently assign it a global category rank there.

Is Gemma 4 26B A4B good for multimodal and grounded tasks?

Gemma 4 26B A4B ranks #42 out of 103 models on multimodal and grounded benchmarks with an average score of 73.8. There are stronger options in this category.

Is Gemma 4 26B A4B open source?

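Gemma 4 26B A4B is an open-weight model created by Google, meaning its weights can be downloaded and run locally or fine-tuned for specific use cases. BenchLM lists its source type as Open Weight, which describes weight availability rather than a specific open-source license.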

Which sibling models are related to Gemma 4 26B A4B?

Gemma 4 26B A4B belongs to the Gemma 4 family. Related variants on BenchLM include Gemma 4 31B, Gemma 4 E4B, and Gemma 4 E2B.

Does Gemma 4 26B A4B have full benchmark coverage on BenchLM?

Not yet. Gemma 4 26B A4B currently has 8 verified benchmark scores out of the 125 benchmarks BenchLM tracks. BenchLM only exposes verified public benchmark rows, so missing categories stay blank until a sourced evaluation is available.

What is the context window size of Gemma 4 26B A4B?

Gemma 4 26B A4B has a context window of 256K tokens, which determines how much text it can process in a single interaction.

Last updated: April 2, 2026 · Runtime metrics stay blank until BenchLM has a sourced snapshot.
