LLM API Pricing Comparison 2026

Compare API pricing for every major LLM side by side. All prices are per million tokens. The Score/$ column shows benchmark score per dollar of output price; higher means better value.

Last updated: April 2, 2026. Prices reflect official API rates. Open-weight model costs depend on your infrastructure.

Cheapest API

LFM2-24B-A2B

$0.03 in / $0.12 out per 1M tokens

Best Value (Score/$)

Gemini 3.1 Flash-Lite

Score/$ ratio: 135.0

Highest Scored (Paid)

GPT-5.4 Pro

Score: 91 — $30 in / $180 out

| Model | Provider | Family | Input price | Output price | Context | Score | Score/$ |
|---|---|---|---|---|---|---|---|
| GPT-5.4 Pro | OpenAI | GPT-5.4 family · pro | $30.00 | $180.00 | 1.05M | 92 | 0.5 |
| GPT-5.1-Codex-Max | OpenAI | | $2.00 | $8.00 | 400K | 81 | 10.6 |
| GPT-5.2-Codex | OpenAI | | $2.00 | $8.00 | 400K | 82 | 10.5 |
| GPT-5.4 | OpenAI | GPT-5.4 family · base | $2.50 | $15.00 | 1.05M | 82 | 5.6 |
| Grok 4.1 | xAI | | $3.00 | $15.00 | 1M | 85 | 5.6 |
| Gemini 3.1 Pro | Google | | $1.25 | $5.00 | 1M | 87 | 16.6 |
| GPT-5.3 Codex | OpenAI | | $2.50 | $10.00 | 400K | 85 | 8.3 |
| Claude Opus 4.6 | Anthropic | | $15.00 | $75.00 | 1M | 85 | 1.1 |
| Gemini 3 Pro Deep Think | Google | Gemini 3 Pro family · reasoning | N/A | N/A | 2M | 80 | |
| GPT-5 (high) | OpenAI | GPT-5 family · reasoning | N/A | N/A | 128K | 82 | |
| GPT-5.2 | OpenAI | GPT-5.2 family · thinking | $2.00 | $8.00 | 400K | 82 | 9.9 |
| GLM-5 (Reasoning) | Zhipu AI | GLM-5 family · reasoning | Free* | Free* | 200K | 82 | |
| GPT-5 (medium) | OpenAI | GPT-5 family · reasoning | N/A | N/A | 128K | 76 | |
| GPT-5.1 | OpenAI | | $1.50 | $6.00 | 200K | 78 | 12.8 |
| Grok 4.1 Fast | xAI | | N/A | N/A | 1M | 70 | |
| o1-preview | OpenAI | o1 family · snapshot | N/A | N/A | 200K | 72 | |
| Qwen3.5 397B (Reasoning) | Alibaba | Qwen3.5 397B family · reasoning | Free* | Free* | 128K | 77 | |
| Gemini 3 Pro | Google | Gemini 3 Pro family · base | N/A | N/A | 2M | 79 | |
| Claude Sonnet 4.6 | Anthropic | | $3.00 | $15.00 | 200K | 84 | 4.9 |
| Kimi K2.5 (Reasoning) | Moonshot AI | Kimi K2.5 family · reasoning | N/A | N/A | 128K | 76 | |
| Claude Opus 4.5 | Anthropic | | N/A | N/A | 200K | 76 | |
| Gemma 4 31B | Google | Gemma 4 family · 31b | Free* | Free* | 256K | 73 | |
| Claude Sonnet 4.5 | Anthropic | | $3.00 | $15.00 | 200K | 68 | 4.7 |
| o3-pro | OpenAI | o3 family · pro | N/A | N/A | 200K | 67 | |
| Qwen3.5-122B-A10B | Alibaba | | Free* | Free* | 262K | 71 | |
| MiMo-V2-Flash | Xiaomi | MiMo-V2-Flash family · base | Free* | Free* | 256K | 67 | |
| o3-mini | OpenAI | o3 family · mini | $1.10 | $4.40 | 200K | 65 | 15.9 |
| Qwen3.5-27B | Alibaba | | Free* | Free* | 262K | 70 | |
| GLM-4.7 | Zhipu AI | | Free* | Free* | 200K | 74 | |
| Kimi K2.5 | Moonshot AI | | $0.50 | $2.80 | 128K | 72 | 24.6 |
| Qwen3.6 Plus | Alibaba | | Free* | Free* | 1M | 69 | |
| GLM-5 | Zhipu AI | GLM-5 family · base | Free* | Free* | 200K | 75 | |
| o3 | OpenAI | o3 family · base | $10.00 | $40.00 | 200K | 64 | 1.7 |
| Qwen3.5-35B-A3B | Alibaba | Qwen3.5-35B-A3B family · base | Free* | Free* | 262K | 66 | |
| GPT-4.1 | OpenAI | GPT-4.1 family · base | $2.00 | $8.00 | 1M | 64 | 8.4 |
| Grok 4 | xAI | | N/A | N/A | 128K | 68 | |
| o1 | OpenAI | o1 family · base | $15.00 | $60.00 | 200K | 64 | 1.1 |
| Qwen2.5-1M | Alibaba | | Free* | Free* | 1M | 62 | |
| Qwen3.5 397B | Alibaba | | Free* | Free* | 128K | 68 | |
| DeepSeek Coder 2.0 | DeepSeek | | $0.27 | $1.10 | 128K | 62 | 60.0 |
| DeepSeek V3.2 (Thinking) | DeepSeek | DeepSeek V3.2 family · reasoning | Free* | Free* | 128K | 67 | |
| DeepSeekMath V2 | DeepSeek | DeepSeekMath family · snapshot | Free* | Free* | 128K | 63 | |
| Claude 4 Sonnet | Anthropic | | N/A | N/A | 200K | 62 | |
| Gemini 2.5 Pro | Google | | $1.25 | $5.00 | 1M | 65 | 13.0 |
| Nemotron 3 Ultra 500B | NVIDIA | | Free* | Free* | 10M | 60 | |
| Claude 4.1 Opus | Anthropic | Claude 4.1 Opus family · base | N/A | N/A | 200K | 62 | |
| Gemini 3 Flash | Google | | $0.50 | $3.00 | 1M | 67 | 21.3 |
| Gemma 4 26B A4B | Google | Gemma 4 family · 26b-a4b | Free* | Free* | 256K | 64 | |
| Qwen2.5-72B | Alibaba | | Free* | Free* | 128K | 60 | |
| Claude Haiku 4.5 | Anthropic | | $0.80 | $4.00 | 200K | 63 | 15.5 |
| DeepSeek LLM 2.0 | DeepSeek | | Free* | Free* | 128K | 57 | |
| DeepSeek V3.2 | DeepSeek | DeepSeek V3.2 family · base | Free* | Free* | 128K | 61 | |
| o4-mini (high) | OpenAI | o4-mini family · reasoning | N/A | N/A | 200K | 58 | |
| Claude 3.5 Sonnet | Anthropic | | N/A | N/A | 200K | 55 | |
| GPT-5.4 mini | OpenAI | GPT-5.4 family · mini | $0.75 | $4.50 | 400K | 66 | 13.3 |
| Grok Code Fast 1 | xAI | | N/A | N/A | 256K | 56 | |
| Kimi K2 | Moonshot AI | | N/A | N/A | 128K | 53 | |
| Mistral Large 3 | Mistral | | $2.00 | $6.00 | 128K | 58 | 10.0 |
| Nemotron 3 Super 100B | NVIDIA | | Free* | Free* | 1M | 56 | |
| GPT-4.1 mini | OpenAI | GPT-4.1 family · mini | $0.40 | $1.60 | 1M | 57 | 36.9 |
| Llama 3.1 405B | Meta | | Free* | Free* | 128K | 53 | |
| Mistral Large 2 | Mistral | | N/A | N/A | 128K | 52 | |
| Claude 4.1 Opus Thinking | Anthropic | Claude 4.1 Opus family · reasoning | N/A | N/A | 200K | 57 | |
| GPT-4o mini | OpenAI | GPT-4o family · mini | $0.15 | $0.60 | 128K | 54 | 95.0 |
| GPT-4o | OpenAI | GPT-4o family · base | $2.50 | $10.00 | 128K | 50 | 5.5 |
| DeepSeek V3 | DeepSeek | DeepSeek family · snapshot | $0.27 | $1.10 | 128K | 49 | 49.1 |
| Gemini 3.1 Flash-Lite | Google | | $0.10 | $0.40 | 1M | 56 | 135.0 |
| Gemini 1.5 Pro | Google | | N/A | N/A | 2M | 50 | |
| Qwen3 235B 2507 (Reasoning) | Alibaba | Qwen3 235B 2507 family · reasoning | Free* | Free* | 128K | 55 | |
| GPT-4.1 nano | OpenAI | GPT-4.1 family · nano | $0.10 | $0.40 | 1M | 44 | 127.5 |
| Claude 3 Opus | Anthropic | | N/A | N/A | 200K | 49 | |
| Gemini 2.5 Flash | Google | | $0.15 | $0.60 | 1M | 50 | 83.3 |
| Claude 3 Haiku | Anthropic | | N/A | N/A | 200K | 43 | |
| GPT-5.4 nano | OpenAI | GPT-5.4 family · nano | $0.20 | $1.25 | 400K | 58 | 39.2 |
| GPT-OSS 120B | OpenAI | GPT-OSS family · base | Free* | Free* | 128K | 50 | |
| Phi-4 | Microsoft | | Free* | Free* | 16K | 40 | |
| Mistral 8x7B | Mistral | Mistral 8x7B family · base | Free* | Free* | 32K | 44 | |
| Nemotron Ultra 253B | NVIDIA | | Free* | Free* | 32K | 41 | |
| GPT-4 Turbo | OpenAI | | N/A | N/A | 128K | 43 | |
| Moonshot v1 | Moonshot AI | Moonshot family · snapshot | N/A | N/A | 128K | 43 | |
| Z-1 | Z | | N/A | N/A | 128K | 44 | |
| Gemini 1.0 Pro | Google | | N/A | N/A | 32K | 40 | |
| Nemotron-4 15B | NVIDIA | | Free* | Free* | 32K | 42 | |
| DeepSeek R1 | DeepSeek | | $0.55 | $2.19 | 128K | 45 | 20.6 |
| Llama 3 70B | Meta | | Free* | Free* | 128K | 44 | |
| Nemotron 3 Nano 30B | NVIDIA | | Free* | Free* | 32K | 42 | |
| o1-pro | OpenAI | o1 family · pro | $150.00 | $600.00 | 200K | 45 | 0.1 |
| Qwen3 235B 2507 | Alibaba | Qwen3 235B 2507 family · base | Free* | Free* | 128K | 47 | |
| Grok 3 [Beta] | xAI | Grok 3 family · snapshot | N/A | N/A | 128K | 48 | |
| Llama 4 Maverick | Meta | | Free* | Free* | 1M | 39 | |
| DBRX Instruct | Databricks | DBRX family · instruct | Free* | Free* | 32K | 41 | |
| DeepSeek V3.1 (Reasoning) | DeepSeek | DeepSeek V3.1 family · reasoning | Free* | Free* | 128K | 43 | |
| Llama 4 Scout | Meta | | Free* | Free* | 10M | 44 | |
| Llama 4 Behemoth | Meta | | Free* | Free* | 32K | 34 | |
| Mixtral 8x22B Instruct v0.1 | Mistral | Mixtral 8x22B family · instruct | Free* | Free* | 64K | 36 | |
| Gemma 3 27B | Google | | Free* | Free* | 32K | 35 | |
| Nova Pro | Amazon | | N/A | N/A | 128K | 33 | |
| GLM-4.5-Air | Zhipu AI | | N/A | N/A | 128K | 38 | |
| GLM-4.5 | Zhipu AI | | N/A | N/A | 128K | 40 | |
| DeepSeek V3.1 | DeepSeek | DeepSeek V3.1 family · base | Free* | Free* | 128K | 41 | |
| GPT-OSS 20B | OpenAI | GPT-OSS family · mini | Free* | Free* | 128K | 36 | |
| Mistral 7B v0.3 | Mistral | Mistral 7B family · snapshot | Free* | Free* | 32K | 29 | |
| Mistral 8x7B v0.2 | Mistral | Mistral 8x7B family · snapshot | Free* | Free* | 32K | 27 | |
| 1-bit Bonsai 1.7B | Prism ML | 1-bit Bonsai family · 1-7b | Free* | Free* | 32K | | |
| 1-bit Bonsai 4B | Prism ML | 1-bit Bonsai family · 4b | Free* | Free* | 32K | | |
| 1-bit Bonsai 8B | Prism ML | 1-bit Bonsai family · 8b | Free* | Free* | 64K | | |
| Aion-2.0 | Aion Labs | | $0.80 | $1.60 | 128K | | |
| Composer 2 | Cursor | Composer family · base | $0.50 | $2.50 | 200K | | |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | DeepSeek R1 Distill family · qwen-32b | Free* | Free* | 128K | | |
| Gemma 4 E2B | Google | Gemma 4 family · e2b | Free* | Free* | 128K | | |
| Gemma 4 E4B | Google | Gemma 4 family · e4b | Free* | Free* | 128K | | |
| GLM-4.7-Flash | Zhipu AI | | Free* | Free* | 200K | | |
| GLM-5-Turbo | Z.AI | GLM-5 family · turbo | $1.20 | $4.00 | 200K | | |
| GLM-5V-Turbo | Z.AI | GLM-5 family · vision-turbo | $1.20 | $4.00 | 200K | | |
| GPT-5 mini | OpenAI | GPT-5 family · mini | N/A | N/A | 128K | | |
| GPT-5 nano | OpenAI | GPT-5 family · nano | $0.05 | $0.40 | 400K | | |
| GPT-5.2 Instant | OpenAI | GPT-5.2 family · instant | $1.50 | $6.00 | 128K | | |
| GPT-5.2 Pro | OpenAI | GPT-5.2 family · pro | $25.00 | $150.00 | 400K | | |
| GPT-5.3 Instant | OpenAI | GPT-5.3 family · instant | $1.75 | $14.00 | 128K | | |
| GPT-5.3-Codex-Spark | OpenAI | GPT-5.3 Codex family · spark | $2.00 | $8.00 | 256K | | |
| Granite-4.0-1B | IBM | Granite 4.0 1B family · dense | Free* | Free* | 128K | | |
| Granite-4.0-350M | IBM | Granite 4.0 350M family · dense | Free* | Free* | 32K | | |
| Granite-4.0-H-1B | IBM | Granite 4.0 1B family · hybrid | Free* | Free* | 128K | | |
| Granite-4.0-H-350M | IBM | Granite 4.0 350M family · hybrid | Free* | Free* | 32K | | |
| Grok 3 Mini | xAI | | $0.30 | $0.50 | 128K | | |
| Grok 4.20 | xAI | Grok 4.20 family · reasoning | $2.00 | $6.00 | 2M | | |
| Grok 4.20 Multi-agent | xAI | Grok 4.20 family · multi-agent | $2.00 | $6.00 | 2M | | |
| Holo3-122B-A10B | H Company | Holo3 family · 122b-a10b | $0.40 | $3.00 | 64K | | |
| Holo3-35B-A3B | H Company | Holo3 family · 35b-a3b | $0.25 | $1.80 | 64K | | |
| Leanstral | Mistral | | Free* | Free* | 256K | | |
| LFM2-24B-A2B | LiquidAI | | $0.03 | $0.12 | 32K | | |
| LFM2.5-1.2B-Instruct | LiquidAI | LFM2.5 1.2B family · instruct | Free* | Free* | 32K | | |
| LFM2.5-1.2B-Thinking | LiquidAI | LFM2.5 1.2B family · reasoning | Free* | Free* | 32K | | |
| LFM2.5-350M | LiquidAI | | Free* | Free* | 32K | | |
| Mercury 2 | Inception | | $0.25 | $0.75 | 128K | | |
| MiniMax M1 80k | MiniMax | | N/A | N/A | 80K | | |
| MiniMax M2.5 | MiniMax | | $0.30 | $1.20 | 128K | | |
| MiniMax M2.7 | MiniMax | | $0.30 | $1.20 | 200K | | |
| Ministral 3 14B | Mistral | Ministral 3 14B family · base | Free* | Free* | 128K | | |
| Ministral 3 14B (Reasoning) | Mistral | Ministral 3 14B family · reasoning | Free* | Free* | 128K | | |
| Ministral 3 3B | Mistral | Ministral 3 3B family · base | Free* | Free* | 128K | | |
| Ministral 3 3B (Reasoning) | Mistral | Ministral 3 3B family · reasoning | Free* | Free* | 128K | | |
| Ministral 3 8B | Mistral | Ministral 3 8B family · base | Free* | Free* | 128K | | |
| Ministral 3 8B (Reasoning) | Mistral | Ministral 3 8B family · reasoning | Free* | Free* | 128K | | |
| Mistral Medium 3 | Mistral | | $0.40 | $2.00 | 128K | | |
| Mistral Small 4 | Mistral | Mistral Small 4 family · base | Free* | Free* | 256K | | |
| Mistral Small 4 (Reasoning) | Mistral | Mistral Small 4 family · reasoning | Free* | Free* | 256K | | |
| Nemotron 3 Super 120B A12B | NVIDIA | | Free* | Free* | 256K | | |
| o4-mini | OpenAI | | $1.10 | $4.40 | 200K | | |
| Qwen2.5 Coder 32B Instruct | Alibaba | Qwen2.5 Coder family · 32b-instruct | Free* | Free* | 128K | | |
| Qwen2.5-VL-32B | Alibaba | | Free* | Free* | 32K | | |
| Qwen3.5 Flash | Alibaba | Qwen3.5 Flash family · base | N/A | N/A | 1M | | |
| Qwen3.5 Plus | Alibaba | Qwen3.5 Plus family · base | N/A | N/A | 1M | | |
| Seed 1.6 | ByteDance | Seed 1.6 family · base | $0.25 | $2.00 | 256K | | |
| Seed 1.6 Flash | ByteDance | Seed 1.6 family · flash | $0.08 | $0.30 | 256K | | |
| Seed-2.0-Lite | ByteDance | Seed 2.0 family · lite | $0.25 | $2.00 | 256K | | |
| Seed-2.0-Mini | ByteDance | Seed 2.0 family · mini | $0.10 | $0.40 | 256K | | |
| Step 3.5 Flash | StepFun | | $0.10 | $0.30 | 256K | | |
| Trinity-Large-Thinking | Arcee AI | Trinity Large family · thinking | $0.25 | $0.90 | 512K | | |

* Score/$ = Overall benchmark score ÷ output price per million tokens. Higher is better. Free/open-weight models are excluded.
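
The Score/$ definition above can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the site's actual calculation: the function name `score_per_dollar` is hypothetical, and for some rows the published ratio differs slightly from the listed score divided by the output price (for example GPT-5.4: 82 ÷ 15.00 ≈ 5.5 against a listed 5.6), so the published figures may rest on inputs not shown in the table.

```python
from typing import Optional


def score_per_dollar(score: float, output_price: Optional[float]) -> Optional[float]:
    """Score divided by output price per 1M tokens (hypothetical helper).

    Returns None when there is no paid output price, mirroring the note
    that free/open-weight models are excluded from the ratio.
    """
    if output_price is None or output_price <= 0:
        return None
    return score / output_price


# Claude Opus 4.6 row: listed score 85, output price $75.00 per 1M tokens.
opus_ratio = score_per_dollar(85, 75.00)
print(round(opus_ratio, 1))  # 1.1, matching the listed Score/$

# A Free* row such as Gemma 4 31B has no API price, so no ratio is produced.
print(score_per_dollar(73, None))  # None
```

Because the ratio divides by output price only, a model with cheap output but expensive input can look better here than it would on an input-heavy workload.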

Frequently Asked Questions

Which LLM has the best price-to-performance ratio?

Based on our benchmarks, Gemini 3.1 Flash-Lite (Score/$ 135.0), DeepSeek V3 (49.1), and Gemini 3 Flash (21.3) offer strong price-to-performance ratios. DeepSeek V3 costs a fraction of proprietary frontier models at $0.27 input and $1.10 output per million tokens, and Google's Flash tiers are positioned for lower-cost, high-volume usage. For self-hosting, open-weight models such as Meta's Llama 4 carry zero API cost, though you still pay for the infrastructure to run them.

How much more expensive is GPT-5.4 Pro than GPT-5.4?

GPT-5.4 is $2.50 input and $15.00 output per million tokens. GPT-5.4 Pro is $30.00 input and $180.00 output. That makes Pro 12x the cost on both input and output tokens, so it only makes sense if the accuracy gain justifies the added spend.
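
To make the gap concrete, the arithmetic can be sketched as below. The prices come from the answer above; the monthly workload (50M input tokens, 10M output tokens) and the helper name `monthly_cost` are assumptions chosen purely for illustration.

```python
# Prices in dollars per 1M tokens, taken from the answer above.
GPT_5_4 = {"input": 2.50, "output": 15.00}
GPT_5_4_PRO = {"input": 30.00, "output": 180.00}

# Price multiple of Pro over the base model, per token type.
input_multiple = GPT_5_4_PRO["input"] / GPT_5_4["input"]
output_multiple = GPT_5_4_PRO["output"] / GPT_5_4["output"]
print(input_multiple, output_multiple)  # 12.0 12.0


def monthly_cost(prices: dict, input_millions: float, output_millions: float) -> float:
    """Total spend for a workload expressed in millions of tokens (hypothetical helper)."""
    return prices["input"] * input_millions + prices["output"] * output_millions


# Assumed workload: 50M input tokens and 10M output tokens per month.
base = monthly_cost(GPT_5_4, 50, 10)
pro = monthly_cost(GPT_5_4_PRO, 50, 10)
print(base, pro, pro - base)  # 275.0 3300.0 3025.0
```

Since both prices scale by the same factor, the total is 12x higher regardless of the input/output mix.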

Are open-source LLMs really free?

Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Self-hosting costs range from $0.50 to $5.00 per hour for cloud GPUs, depending on model size. Many providers also offer API access to open models at lower prices than proprietary alternatives.
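
An hourly GPU rate can be converted into an effective per-million-token price for comparison with API rates. The sketch below is a rough illustration: the throughput of 1,000 tokens per second is an assumed figure rather than a measured one, the function name `self_hosted_cost_per_million` is hypothetical, and the calculation assumes the GPU is fully utilized for the whole hour.

```python
def self_hosted_cost_per_million(gpu_cost_per_hour: float, tokens_per_second: float) -> float:
    """Effective dollars per 1M generated tokens for a GPU billed by the hour.

    Assumes the GPU runs at the given throughput for the full hour;
    idle time raises the real cost per token.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


# Low and high ends of the $0.50-$5.00/hour range, at an assumed 1,000 tokens/second.
for hourly in (0.50, 5.00):
    cost = self_hosted_cost_per_million(hourly, tokens_per_second=1000)
    print(f"${hourly:.2f}/hour -> ${cost:.2f} per 1M tokens")
# $0.50/hour -> $0.14 per 1M tokens
# $5.00/hour -> $1.39 per 1M tokens
```

Real throughput varies widely with model size, batch size, and hardware, so plug in your own measurements before comparing against API prices.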

What is the cheapest LLM API in 2026?

The lowest price listed in the table is LFM2-24B-A2B at $0.03/$0.12 per million tokens (input/output). Gemini 3.1 Flash-Lite is among the cheapest proprietary APIs at $0.10/$0.40. Gemini 2.5 Flash remains very affordable at $0.15/$0.60, DeepSeek V3 is also inexpensive at $0.27/$1.10, and Gemini 3 Flash currently sits at $0.50/$3.00. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.
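
Which model is cheapest for you depends on your mix of input and output tokens. As a minimal sketch, the snippet below ranks the models named in this answer by the cost of a single request. The request size (2,000 input tokens, 500 output tokens) and the names `PRICES` and `request_cost` are assumptions made for illustration.

```python
# (input, output) prices in dollars per 1M tokens, from the answer above.
PRICES = {
    "Gemini 3.1 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.15, 0.60),
    "DeepSeek V3": (0.27, 1.10),
    "Gemini 3 Flash": (0.50, 3.00),
    "o3-mini": (1.10, 4.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request, given per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Assumed request shape: 2,000 input tokens and 500 output tokens.
ranked = sorted(PRICES, key=lambda m: request_cost(m, 2000, 500))
for model in ranked:
    print(f"{model}: ${request_cost(model, 2000, 500):.6f}")
```

Changing the token counts to match your own traffic will show whether the ordering holds for output-heavy workloads, where the output price dominates.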
