Gemma 4 E4B Benchmark Scores & Performance

BenchLM is tracking Gemma 4 E4B by Google. Some benchmark data is visible, but this profile is currently excluded from the public leaderboard because it lacks enough verified coverage to rank safely. Only verified public benchmark rows appear below.

Gemma 4 E4B is an open-weight model with a 128K-token context window. It uses explicit chain-of-thought reasoning, which typically improves performance on math and complex reasoning tasks at the cost of higher latency and token usage.

Gemma 4 E4B sits inside the Gemma 4 family alongside Gemma 4 31B, Gemma 4 26B A4B, and Gemma 4 E2B. This profile currently has 6 of 125 tracked benchmarks. BenchLM only exposes verified benchmark rows publicly, so missing categories stay blank until a sourced evaluation is available.

Its strongest category is Multimodal & Grounded (#76). Relative to its other categories, this makes it best suited for screenshots, documents, charts, and grounded multimodal workflows.

Provider

Google

Source Type

Open Weight

Reasoning

Reasoning

Context Window

128K

Model Status

Current

Release Date

Apr 2, 2026

Overall Score

Unranked

Pricing

$0.00 / $0.00

Input / output per 1M tokens

Runtime

N/A

Latency unavailable

Family & Lineage

Family

Gemma 4

E4B

Canonical Entry

Gemma 4 31B

Rankings Overview

BenchLM is still missing enough verified benchmark coverage to rank this model across the public leaderboard. Only verified public benchmark rows are shown below.

Knowledge Benchmarks

GPQA · Refreshing
58.6%

GPQA Diamond · Static refresh · updated April 2, 2026

MMLU-Pro · Refreshing
69.4%

MMLU-Pro · Static refresh · updated April 2, 2026

Coding Benchmarks

LiveCodeBench · Current
52.0%

Rolling 2026 set · Rolling refresh · updated April 2, 2026

Reasoning Benchmarks

BBH · Stale · Saturated · Display only
33.1%

BBH 2022 · Static refresh · updated April 2, 2026

MRCRv2 · Current
25.4%

MRCRv2 2025 · Quarterly refresh · updated April 2, 2026

Multimodal & Grounded Benchmarks

MMMU-Pro · Refreshing
52.6%

MMMU-Pro 2024 · Annual refresh · updated April 2, 2026

Frequently Asked Questions

How does Gemma 4 E4B perform overall in AI benchmarks?

Gemma 4 E4B has 6 verified benchmark scores on BenchLM, but it does not yet have enough coverage to receive a global overall rank.

Is Gemma 4 E4B good for knowledge and understanding?

Gemma 4 E4B has visible benchmark coverage in knowledge and understanding, but BenchLM does not currently assign it a global category rank there.

Is Gemma 4 E4B good for coding and programming?

Gemma 4 E4B has visible benchmark coverage in coding and programming, but BenchLM does not currently assign it a global category rank there.

Is Gemma 4 E4B good for reasoning and logic?

Gemma 4 E4B has visible benchmark coverage in reasoning and logic, but BenchLM does not currently assign it a global category rank there.

Is Gemma 4 E4B good for multimodal and grounded tasks?

Gemma 4 E4B ranks #76 out of 103 models in multimodal and grounded tasks benchmarks with an average score of 52.6%. There are stronger options in this category.

Is Gemma 4 E4B open source?

Yes. Gemma 4 E4B is an open-weight model created by Google, meaning its weights can be downloaded and run locally or fine-tuned for specific use cases.

Which sibling models are related to Gemma 4 E4B?

Gemma 4 E4B belongs to the Gemma 4 family. Related variants on BenchLM include Gemma 4 31B, Gemma 4 26B A4B, and Gemma 4 E2B.

Does Gemma 4 E4B have full benchmark coverage on BenchLM?

Not yet. Gemma 4 E4B currently has 6 verified benchmark scores out of the 125 benchmarks BenchLM tracks. BenchLM only exposes verified public benchmark rows, so missing categories stay blank until a sourced evaluation is available.

What is the context window size of Gemma 4 E4B?

Gemma 4 E4B has a context window of 128K tokens, which determines how much text it can process in a single interaction.
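As a rough illustration of budgeting against a 128K-token window, the sketch below estimates whether a prompt fits while leaving headroom for the response. The 4-characters-per-token heuristic is an assumption for English prose, not Gemma's actual tokenizer, and the `reserved_for_output` figure is an arbitrary example value.

```python
# Rough check of whether a prompt fits in a 128K-token context window.
# Token counts are tokenizer-specific; ~4 characters per token is only a
# loose heuristic for English text (an assumption, not Gemma's tokenizer).

CONTEXT_WINDOW = 128_000   # total window, in tokens (128K)
CHARS_PER_TOKEN = 4        # rough heuristic for English prose

def estimated_tokens(text: str) -> int:
    """Estimate token count from character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(text: str, reserved_for_output: int = 4_000) -> bool:
    """Budget the window: prompt tokens plus headroom for the response."""
    return estimated_tokens(text) + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello " * 1000))   # short prompt fits: True
```

For production use, replace the heuristic with the model's real tokenizer, since actual token counts can differ substantially from character-based estimates.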

Last updated: April 2, 2026 · Runtime metrics stay blank until BenchLM has a sourced snapshot.

Weekly LLM Updates

New model releases, benchmark scores, and leaderboard changes. Every Friday.

Free. Your signup is stored with a derived country code for compliance routing.