This ranking answers the simplest question: which model gives you the most benchmark performance per dollar? It divides each model's overall weighted score (across all 8 categories) by its output token price. The leaders here are generalist value picks — strong across coding, reasoning, agentic, knowledge, and more, at prices that won't break your API budget. Start here if you need a single cost-effective model for mixed workloads.
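The arithmetic behind this ranking is simple enough to sketch. Below is a minimal illustration in Python; the model names, scores, and prices are made-up placeholders, not actual BenchLM.ai data, and the real scoring formula's category weighting is not reproduced here.

```python
# Sketch of the value metric described above:
#   value = overall weighted score / output token price
# All figures below are illustrative assumptions, not BenchLM.ai data.

models = [
    # (name, overall weighted score, output price in $ per 1M tokens)
    ("model-a", 70.0, 0.50),
    ("model-b", 66.0, 0.60),
    ("model-c", 72.0, 4.00),
]

def value_score(score: float, price_per_m_tokens: float) -> float:
    """Benchmark points per dollar of output tokens."""
    return score / price_per_m_tokens

# Rank models from best to worst value.
ranked = sorted(models, key=lambda m: value_score(m[1], m[2]), reverse=True)
for name, score, price in ranked:
    print(f"{name}: {value_score(score, price):.1f} points per $")
```

Note how the division rewards cheap generalists: a mid-scoring model at a low price can out-rank a higher-scoring but expensive one.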
According to BenchLM.ai, Gemini 3.1 Flash-Lite leads this ranking with a score of 140, followed by GPT-4.1 nano (110) and GPT-4o mini (90). There is a significant gap between the leading models and the rest of the field.
The best open-weight option is DeepSeek Coder 2.0 (ranked #5 with a score of 56.36). While proprietary models lead, open-weight options are within striking distance for teams willing to trade a few points of performance for full model control.
This ranking is based on overall weighted scores computed with BenchLM.ai's scoring formula. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.