Multimodal workloads — processing images, charts, documents, and screenshots — often involve large inputs that drive up token costs quickly. This ranking divides each model's weighted multimodal score (MMMU-Pro, OfficeQA-Pro) by output token price. For document processing pipelines and visual AI applications running at scale, the value leaders here offer the best multimodal reasoning per dollar spent.
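The scoring described above can be sketched in a few lines. This is an illustrative reconstruction, not BenchLM.ai's actual pipeline: the benchmark weights, model names, scores, and prices below are hypothetical placeholders chosen only to show how a cheap model can out-rank a stronger but pricier one on value.

```python
def value_score(benchmark_scores, weights, output_price_per_mtok):
    """Weighted benchmark average divided by output token price ($ per 1M tokens)."""
    weighted = sum(benchmark_scores[b] * w for b, w in weights.items())
    total_w = sum(weights.values())
    return (weighted / total_w) / output_price_per_mtok

# Hypothetical data: equal weights, scores out of 100, price in $ per 1M output tokens.
weights = {"MMMU-Pro": 0.5, "OfficeQA-Pro": 0.5}
models = {
    "model-a": ({"MMMU-Pro": 68.0, "OfficeQA-Pro": 74.0}, 0.40),  # cheaper, weaker
    "model-b": ({"MMMU-Pro": 75.0, "OfficeQA-Pro": 80.0}, 2.00),  # stronger, pricier
}

ranking = sorted(
    ((name, value_score(scores, weights, price))
     for name, (scores, price) in models.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
# model-a: 177.50
# model-b: 38.75
```

Note how the cheaper model tops the value ranking despite a lower raw benchmark average, which is exactly the dynamic that puts a lightweight model at the head of this leaderboard.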
According to BenchLM.ai, Gemini 3.1 Flash-Lite leads this ranking with a score of 182.75, followed by GPT-4.1 nano (148.25) and Gemini 2.5 Flash (112.75). There is a significant gap between the leading models and the rest of the field.
The best open-weight option is DeepSeek Coder 2.0 (ranked #5 with a score of 53.23). While proprietary models lead by a wide margin on this value metric, open-weight options remain a viable choice for teams willing to trade performance for full model control.
This ranking is based on weighted averages across the scoring benchmarks in the multimodalGrounded category tracked by BenchLM.ai. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs" links.