Skip to main content

Best Value Multimodal AI Model in 2026 — Cost-Adjusted Rankings

Multimodal workloads — processing images, charts, documents, and screenshots — often involve large inputs that drive up token costs quickly. This ranking divides each model's weighted multimodal score (MMMU-Pro, OfficeQA-Pro) by output token price. For document processing pipelines and visual AI applications running at scale, the value leaders here offer the best multimodal reasoning per dollar spent.

Unless noted otherwise, ranking surfaces on this page use BenchLM's provisional leaderboard lane rather than the stricter sourced-only verified leaderboard.

Bottom line: Multimodal inputs are large and expensive. Gemini 3.1 Flash-Lite dominates value with strong multimodal scores at the lowest price point.

According to BenchLM.ai, Gemini 3.1 Flash-Lite leads this ranking with a score of 156.75, followed by GPT-4.1 nano (96.55) and Gemini 2.5 Flash (88.27). There is a significant gap between the leading models and the rest of the field.

The best open-weight option is DeepSeek Coder 2.0 (ranked #6 with a score of 34.04). While proprietary models lead, open-weight options are within striking distance for teams willing to trade a few points of performance for full model control.

This ranking is based on provisional weighted averages across the scoring benchmarks in multimodalGrounded tracked by BenchLM.ai. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.

What changed

Gemini 3.1 Flash-Lite leads multimodal value — best visual reasoning per dollar.

GPT-4.1 nano strong multimodal value in OpenAI's lineup.

Gemini 2.5 Flash good multimodal value with broader capabilities.

How to choose

Full Rankings (33 models)

Gemini 3.1 Flash-Lite
Google·Proprietary·1M

156.75

Score/$

Score: 62.7 · $0.4/1M

GPT-4.1 nano
OpenAI·Proprietary·1M

96.55

Score/$

Score: 38.6 · $0.4/1M

Gemini 2.5 Flash
Google·Proprietary·1M

88.27

Score/$

Score: 53 · $0.6/1M

4
GPT-4o mini
OpenAI·Proprietary·128K

65.47

Score/$

Score: 39.3 · $0.6/1M

5
GPT-4.1 mini
OpenAI·Proprietary·1M

35.49

Score/$

Score: 56.8 · $1.6/1M

6
DeepSeek Coder 2.0
DeepSeek·Open Weight·128K

34.04

Score/$

Score: 37.4 · $1.1/1M

7
Gemini 3 Flash
Google·Proprietary·1M

24.74

Score/$

Score: 74.2 · $3/1M

8
Kimi K2.5
Moonshot AI·Open Weight·128K

23.02

Score/$

Score: 64.5 · $2.8/1M

9
Gemini 3.1 Pro
Google·Proprietary·1M

18.08

Score/$

Score: 90.4 · $5/1M

10
Gemini 2.5 Pro
Google·Proprietary·1M

16.82

Score/$

Score: 84.1 · $5/1M

11
GPT-5.1
OpenAI·Proprietary·200K

15.97

Score/$

Score: 95.8 · $6/1M

12
GPT-5.4 mini
OpenAI·Proprietary·400K

15.56

Score/$

Score: 70 · $4.5/1M

13
o3-mini
OpenAI·Proprietary·200K

14.79

Score/$

Score: 65.1 · $4.4/1M

14
Claude Haiku 4.5
Anthropic·Proprietary·200K

14.39

Score/$

Score: 72 · $5/1M

15
Grok 4.20
xAI·Proprietary·2M

11.26

Score/$

Score: 67.6 · $6/1M

16
GPT-5.1-Codex-Max
OpenAI·Proprietary·400K

11.22

Score/$

Score: 89.8 · $8/1M

17
Mistral Large 3
Mistral·Proprietary·128K

11.16

Score/$

Score: 67 · $6/1M

18
GPT-5.2-Codex
OpenAI·Proprietary·400K

11.11

Score/$

Score: 88.9 · $8/1M

19
GPT-5.2
OpenAI·Proprietary·400K

10.78

Score/$

Score: 86.3 · $8/1M

20
GPT-5.3 Codex
OpenAI·Proprietary·400K

9.53

Score/$

Score: 95.3 · $10/1M

21
GPT-4.1
OpenAI·Proprietary·1M

7.99

Score/$

Score: 63.9 · $8/1M

22
DeepSeek-R1
DeepSeek·Open Weight·128K

7.98

Score/$

Score: 17.5 · $2.19/1M

23
Grok 4.1
xAI·Proprietary·1M

6.5

Score/$

Score: 97.5 · $15/1M

24
Claude Sonnet 4.6
Anthropic·Proprietary·200K

6.33

Score/$

Score: 95 · $15/1M

25
Claude Sonnet 4.5
Anthropic·Proprietary·200K

6.28

Score/$

Score: 94.2 · $15/1M

26
GPT-4o
OpenAI·Proprietary·128K

6.1

Score/$

Score: 61 · $10/1M

27
GPT-5.4
OpenAI·Proprietary·1.05M

5.86

Score/$

Score: 87.9 · $15/1M

28
Claude Opus 4.6
Anthropic·Proprietary·1M

3.37

Score/$

Score: 84.2 · $25/1M

29
o3
OpenAI·Proprietary·200K

1.54

Score/$

Score: 61.4 · $40/1M

30
o1
OpenAI·Proprietary·200K

0.98

Score/$

Score: 58.7 · $60/1M

31
Claude Mythos Preview
Anthropic·Proprietary·1M

0.78

Score/$

Score: 97.8 · $125/1M

32
GPT-5.4 Pro
OpenAI·Proprietary·1.05M

0.56

Score/$

Score: 100 · $180/1M

33
o1-pro
OpenAI·Proprietary·200K

0.03

Score/$

Score: 18.9 · $600/1M

These rankings update weekly

Get notified when models move. One email a week with what changed and why.

Free. No spam. Unsubscribe anytime.

Key Takeaways

The best value model is Gemini 3.1 Flash-Lite by Google with a provisional Score/$ ratio of 156.75 (score: 62.7, output: $0.4/1M tokens).

The best open-weight model is DeepSeek Coder 2.0 at position #6.

33 models are included in this ranking.

Score in Context

What these scores mean

Value scores divide the weighted multimodal score by output token price (per 1M tokens). Higher means more capability per dollar. Models with no listed price are excluded.

Known limitations

Value rankings favor cheap models even if absolute performance is modest. A model scoring half as well at one-tenth the price wins on value — but may not meet your quality bar. Always check raw scores alongside value rankings.

Last updated: April 16, 2026

The AI models change fast. We track them for you.

For engineers, researchers, and the plain curious — a weekly brief on new models, ranking shifts, and pricing changes.

Free. No spam. Unsubscribe anytime.