
LLM API Pricing Comparison 2026

Compare API pricing for every major LLM side by side. All prices are per million tokens. The Score/$ column divides BenchLM's provisional overall score by the output price per million tokens; higher means better value.

Last updated: April 21, 2026. Prices reflect official API rates. Open-weight model costs depend on your infrastructure. How does LLM token pricing work?

Cheapest API

LFM2-24B-A2B

$0.03 in / $0.12 out per 1M tokens

Best Value (Score/$)

Gemini 3.1 Flash-Lite

Score/$ ratio: 100.0

Highest Scored (Paid)

Claude Mythos Preview

Prov. score: 99 — $25 in / $125 out

Model the full break-even →

| Model | Provider | Family · variant | Input | Output | Context | Score | Score/$ | Self-host est. |
|---|---|---|---|---|---|---|---|---|
| Claude Mythos Preview | Anthropic | Claude Mythos family · preview | $25.00 | $125.00 | 1M | 99 | 0.8 | - |
| Claude Opus 4.7 | Anthropic | Claude Opus 4.7 family · base | $5.00 | $25.00 | 1M | 97 | 3.7 | - |
| GPT-5.3 Codex | OpenAI | - | $2.50 | $10.00 | 400K | ~89 | 9.2 | - |
| GPT-5.4 | OpenAI | GPT-5.4 family · base | $2.50 | $15.00 | 1.05M | 93 | 6.1 | - |
| Gemini 3.1 Pro | Google | - | $1.25 | $5.00 | 1M | 93 | 18.2 | - |
| GPT-5.1-Codex-Max | OpenAI | - | $2.00 | $8.00 | 400K | ~78 | 11.4 | - |
| GPT-5.2-Codex | OpenAI | - | $2.00 | $8.00 | 400K | ~80 | 11.4 | - |
| GPT-5.4 Pro | OpenAI | GPT-5.4 family · pro | $30.00 | $180.00 | 1.05M | 92 | 0.5 | - |
| Grok 4.1 | xAI | - | $3.00 | $15.00 | 1M | ~80 | 6.0 | - |
| Claude Opus 4.6 | Anthropic | - | $5.00 | $25.00 | 1M | 91 | 3.5 | - |
| Gemini 3 Pro Deep Think | Google | Gemini 3 Pro family · reasoning | N/A | N/A | 2M | ~86 | - | - |
| GPT-5 (medium) | OpenAI | GPT-5 family · reasoning | N/A | N/A | 128K | ~73 | - | - |
| GPT-5 (high) | OpenAI | GPT-5 family · reasoning | N/A | N/A | 128K | ~79 | - | - |
| o1-preview | OpenAI | o1 family · snapshot | N/A | N/A | 200K | ~68 | - | - |
| GPT-5.1 | OpenAI | - | $1.50 | $6.00 | 200K | ~80 | 13.8 | - |
| Kimi 2.6 | Moonshot AI | Kimi 2.6 family · base | Free* | Free* | 256K | 83 | - | - |
| Claude Sonnet 4.6 | Anthropic | - | $3.00 | $15.00 | 200K | 85 | 5.5 | - |
| GLM-5 (Reasoning) | Z.AI | GLM-5 family · reasoning | Free* | Free* | 200K | ~84 | - | - |
| Grok 4.1 Fast | xAI | - | N/A | N/A | 1M | ~72 | - | - |
| GPT-5.2 | OpenAI | GPT-5.2 family · thinking | $2.00 | $8.00 | 400K | 83 | 10.0 | - |
| Qwen3.5 397B (Reasoning) | Alibaba | Qwen3.5 397B family · reasoning | Free* | Free* | 128K | ~80 | - | - |
| Claude Sonnet 4.5 | Anthropic | - | $3.00 | $15.00 | 200K | ~68 | 5.1 | - |
| Gemini 3 Pro | Google | Gemini 3 Pro family · base | N/A | N/A | 2M | ~83 | - | - |
| GLM-5.1 | Z.AI | GLM-5 family · flagship | $1.40 | $4.40 | 203K | 84 | 17.3 | $18,221/mo |
| Claude Opus 4.5 | Anthropic | - | N/A | N/A | 200K | 80 | - | - |
| Kimi K2.5 (Reasoning) | Moonshot AI | Kimi K2.5 family · reasoning | N/A | N/A | 128K | ~79 | - | - |
| o3-mini | OpenAI | o3 family · mini | $1.10 | $4.40 | 200K | ~58 | 16.1 | - |
| o3-pro | OpenAI | o3 family · pro | N/A | N/A | 200K | ~59 | - | - |
| o3 | OpenAI | o3 family · base | $10.00 | $40.00 | 200K | ~59 | 1.7 | - |
| Qwen3.5-122B-A10B | Alibaba | - | Free* | Free* | 262K | 68 | - | - |
| Qwen3.6 Plus | Alibaba | - | Free* | Free* | 1M | 77 | - | - |
| GLM-4.7 | Z.AI | - | Free* | Free* | 200K | ~71 | - | - |
| GLM-5 | Z.AI | GLM-5 family · base | Free* | Free* | 200K | 77 | - | - |
| MiMo-V2-Flash | Xiaomi | MiMo-V2-Flash family · base | Free* | Free* | 256K | ~62 | - | - |
| Grok 4 | xAI | - | N/A | N/A | 128K | ~67 | - | - |
| o1 | OpenAI | o1 family · base | $15.00 | $60.00 | 200K | ~59 | 1.1 | - |
| GPT-4.1 | OpenAI | GPT-4.1 family · base | $2.00 | $8.00 | 1M | ~60 | 8.1 | - |
| Qwen3.5-27B | Alibaba | - | Free* | Free* | 262K | 65 | - | - |
| Gemini 2.5 Pro | Google | - | $1.25 | $5.00 | 1M | ~67 | 12.8 | - |
| Grok 4.20 | xAI | Grok 4.20 family · reasoning | $2.00 | $6.00 | 2M | 77 | 10.7 | - |
| Kimi K2.5 | Moonshot AI | - | $0.50 | $2.80 | 256K | 68 | 22.9 | $5,221/mo |
| Qwen2.5-1M | Alibaba | - | Free* | Free* | 1M | ~53 | - | - |
| Qwen3.5 397B | Alibaba | - | Free* | Free* | 128K | 66 | - | - |
| DeepSeek Coder 2.0 | DeepSeek | - | $0.27 | $1.10 | 128K | ~53 | 57.3 | - |
| DeepSeekMath V2 | DeepSeek | DeepSeekMath family · snapshot | Free* | Free* | 128K | ~52 | - | - |
| GPT-5.4 mini | OpenAI | GPT-5.4 family · mini | $0.75 | $4.50 | 400K | 73 | 14.0 | - |
| Claude 4 Sonnet | Anthropic | - | N/A | N/A | 200K | ~52 | - | - |
| Claude 4.1 Opus | Anthropic | Claude 4.1 Opus family · base | N/A | N/A | 200K | ~53 | - | - |
| DeepSeek V3.2 (Thinking) | DeepSeek | DeepSeek V3.2 family · reasoning | Free* | Free* | 128K | ~65 | - | - |
| Qwen3.5-35B-A3B | Alibaba | Qwen3.5-35B-A3B family · base | Free* | Free* | 262K | 59 | - | - |
| Nemotron 3 Ultra 500B | NVIDIA | - | Free* | Free* | 10M | ~48 | - | - |
| Gemma 4 26B A4B | Google | Gemma 4 family · 26b-a4b | Free* | Free* | 256K | ~58 | - | - |
| Gemma 4 31B | Google | Gemma 4 family · 31b | Free* | Free* | 256K | ~66 | - | $429/mo |
| Qwen2.5-72B | Alibaba | - | Free* | Free* | 128K | ~52 | - | - |
| Kimi K2 | Moonshot AI | - | N/A | N/A | 128K | ~43 | - | - |
| Claude Haiku 4.5 | Anthropic | - | $1.00 | $5.00 | 200K | ~59 | 11.2 | - |
| DeepSeek LLM 2.0 | DeepSeek | - | Free* | Free* | 128K | ~53 | - | - |
| DeepSeek V3.2 | DeepSeek | DeepSeek V3.2 family · base | Free* | Free* | 128K | ~60 | - | - |
| Gemini 3 Flash | Google | - | $0.50 | $3.00 | 1M | ~67 | 18.7 | - |
| MiniMax M2.7 | MiniMax | - | $0.30 | $1.20 | 200K | 64 | 45.0 | - |
| o4-mini (high) | OpenAI | o4-mini family · reasoning | N/A | N/A | 200K | ~46 | - | - |
| Claude 3.5 Sonnet | Anthropic | - | N/A | N/A | 200K | ~42 | - | - |
| GPT-4.1 mini | OpenAI | GPT-4.1 family · mini | $0.40 | $1.60 | 1M | ~47 | 32.5 | - |
| Nemotron 3 Super 100B | NVIDIA | - | Free* | Free* | 1M | ~46 | - | - |
| Grok Code Fast 1 | xAI | - | N/A | N/A | 256K | ~42 | - | - |
| Llama 3.1 405B | Meta | - | Free* | Free* | 128K | ~43 | - | - |
| Mistral Large 3 | Mistral | - | $2.00 | $6.00 | 128K | ~52 | 8.3 | $9,110/mo |
| Claude 4.1 Opus Thinking | Anthropic | Claude 4.1 Opus family · reasoning | N/A | N/A | 200K | ~45 | - | - |
| GPT-4o mini | OpenAI | GPT-4o family · mini | $0.15 | $0.60 | 128K | ~45 | 81.7 | - |
| Mistral Large 2 | Mistral | - | N/A | N/A | 128K | ~40 | - | - |
| Sarvam 105B | Sarvam | Sarvam 105B family · base | Free* | Free* | 128K | ~41 | - | - |
| GPT-4o | OpenAI | GPT-4o family · base | $2.50 | $10.00 | 128K | ~41 | 4.3 | - |
| DeepSeek V3 | DeepSeek | DeepSeek family · snapshot | $0.27 | $1.10 | 128K | ~37 | 37.3 | $18,221/mo |
| Gemini 3.1 Flash-Lite | Google | - | $0.10 | $0.40 | 1M | ~51 | 100.0 | - |
| Gemini 1.5 Pro | Google | - | N/A | N/A | 2M | ~37 | - | - |
| Phi-4 | Microsoft | - | Free* | Free* | 16K | ~29 | - | - |
| Qwen3 235B 2507 (Reasoning) | Alibaba | Qwen3 235B 2507 family · reasoning | Free* | Free* | 128K | ~48 | - | - |
| Claude 3 Opus | Anthropic | - | N/A | N/A | 200K | ~36 | - | - |
| Gemini 2.5 Flash | Google | - | $0.15 | $0.60 | 1M | ~40 | 56.7 | - |
| DBRX Instruct | Databricks | DBRX family · instruct | Free* | Free* | 32K | ~33 | - | - |
| GPT-4.1 nano | OpenAI | GPT-4.1 family · nano | $0.10 | $0.40 | 1M | ~28 | 82.5 | - |
| Claude 3 Haiku | Anthropic | - | N/A | N/A | 200K | ~24 | - | - |
| Gemini 1.0 Pro | Google | - | N/A | N/A | 32K | ~25 | - | - |
| GPT-4 Turbo | OpenAI | - | N/A | N/A | 128K | ~27 | - | - |
| GPT-OSS 120B | OpenAI | GPT-OSS family · base | Free* | Free* | 128K | ~38 | - | - |
| Mistral 8x7B | Mistral | Mistral 8x7B family · base | Free* | Free* | 32K | ~25 | - | - |
| Nemotron Ultra 253B | NVIDIA | - | Free* | Free* | 32K | ~23 | - | - |
| Moonshot v1 | Moonshot AI | Moonshot family · snapshot | N/A | N/A | 128K | ~24 | - | - |
| Z-1 | Z | - | N/A | N/A | 128K | ~25 | - | - |
| o1-pro | OpenAI | o1 family · pro | $150.00 | $600.00 | 200K | ~30 | 0.1 | - |
| DeepSeek R1 | DeepSeek | - | $0.55 | $2.19 | 128K | ~35 | 11.9 | $18,221/mo |
| Mixtral 8x22B Instruct v0.1 | Mistral | Mixtral 8x22B family · instruct | Free* | Free* | 64K | ~24 | - | - |
| Nemotron-4 15B | NVIDIA | - | Free* | Free* | 32K | ~24 | - | - |
| Llama 3 70B | Meta | - | Free* | Free* | 128K | ~28 | - | - |
| Nemotron 3 Nano 30B | NVIDIA | - | Free* | Free* | 32K | ~27 | - | - |
| Qwen3 235B 2507 | Alibaba | Qwen3 235B 2507 family · base | Free* | Free* | 128K | ~35 | - | - |
| Grok 3 [Beta] | xAI | Grok 3 family · snapshot | N/A | N/A | 128K | ~34 | - | - |
| Llama 4 Maverick | Meta | - | Free* | Free* | 1M | ~18 | - | $2,610/mo |
| Llama 4 Scout | Meta | - | Free* | Free* | 10M | ~24 | - | $2,278/mo |
| DeepSeek V3.1 (Reasoning) | DeepSeek | DeepSeek V3.1 family · reasoning | Free* | Free* | 128K | ~32 | - | - |
| Llama 4 Behemoth | Meta | - | Free* | Free* | 32K | ~12 | - | - |
| Nova Pro | Amazon | - | N/A | N/A | 128K | ~11 | - | - |
| Gemma 3 27B | Google | - | Free* | Free* | 32K | ~18 | - | - |
| GLM-4.5 | Z.AI | - | N/A | N/A | 128K | ~29 | - | - |
| GPT-OSS 20B | OpenAI | GPT-OSS family · mini | Free* | Free* | 128K | ~19 | - | - |
| DeepSeek V3.1 | DeepSeek | DeepSeek V3.1 family · base | Free* | Free* | 128K | ~28 | - | - |
| GLM-4.5-Air | Z.AI | - | N/A | N/A | 128K | ~21 | - | - |
| Mistral 7B v0.3 | Mistral | Mistral 7B family · snapshot | Free* | Free* | 32K | ~5 | - | - |
| Mistral 8x7B v0.2 | Mistral | Mistral 8x7B family · snapshot | Free* | Free* | 32K | ~2 | - | - |
| 1-bit Bonsai 1.7B | Prism ML | 1-bit Bonsai family · 1-7b | Free* | Free* | 32K | - | - | - |
| 1-bit Bonsai 4B | Prism ML | 1-bit Bonsai family · 4b | Free* | Free* | 32K | - | - | - |
| 1-bit Bonsai 8B | Prism ML | 1-bit Bonsai family · 8b | Free* | Free* | 64K | - | - | - |
| Aion-2.0 | Aion Labs | - | $0.80 | $1.60 | 128K | - | - | - |
| Composer 2 | Cursor | Composer family · base | $0.50 | $2.50 | 200K | - | - | - |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | DeepSeek R1 Distill family · qwen-32b | Free* | Free* | 128K | - | - | - |
| Gemma 4 E2B | Google | Gemma 4 family · e2b | Free* | Free* | 128K | - | - | - |
| Gemma 4 E4B | Google | Gemma 4 family · e4b | Free* | Free* | 128K | - | - | - |
| GLM-4.7-Flash | Z.AI | - | Free* | Free* | 200K | - | - | - |
| GLM-5-Turbo | Z.AI | GLM-5 family · turbo | $1.20 | $4.00 | 200K | - | - | - |
| GLM-5V-Turbo | Z.AI | GLM-5 family · vision-turbo | $1.20 | $4.00 | 200K | - | - | - |
| GPT-5 mini | OpenAI | GPT-5 family · mini | N/A | N/A | 128K | - | - | - |
| GPT-5 nano | OpenAI | GPT-5 family · nano | $0.05 | $0.40 | 400K | - | - | - |
| GPT-5.2 Instant | OpenAI | GPT-5.2 family · instant | $1.50 | $6.00 | 128K | - | - | - |
| GPT-5.2 Pro | OpenAI | GPT-5.2 family · pro | $25.00 | $150.00 | 400K | - | - | - |
| GPT-5.3 Instant | OpenAI | GPT-5.3 family · instant | $1.75 | $14.00 | 128K | - | - | - |
| GPT-5.3-Codex-Spark | OpenAI | GPT-5.3 Codex family · spark | $2.00 | $8.00 | 256K | - | - | - |
| GPT-5.4 nano | OpenAI | GPT-5.4 family · nano | $0.20 | $1.25 | 400K | - | - | - |
| Granite-4.0-1B | IBM | Granite 4.0 1B family · dense | Free* | Free* | 128K | - | - | - |
| Granite-4.0-350M | IBM | Granite 4.0 350M family · dense | Free* | Free* | 32K | - | - | - |
| Granite-4.0-H-1B | IBM | Granite 4.0 1B family · hybrid | Free* | Free* | 128K | - | - | - |
| Granite-4.0-H-350M | IBM | Granite 4.0 350M family · hybrid | Free* | Free* | 32K | - | - | - |
| Grok 3 Mini | xAI | - | $0.30 | $0.50 | 128K | - | - | - |
| Grok 4.20 Multi-agent | xAI | Grok 4.20 family · multi-agent | $2.00 | $6.00 | 2M | - | - | - |
| Holo3-122B-A10B | H Company | Holo3 family · 122b-a10b | $0.40 | $3.00 | 64K | - | - | - |
| Holo3-35B-A3B | H Company | Holo3 family · 35b-a3b | $0.25 | $1.80 | 64K | - | - | - |
| Leanstral | Mistral | - | Free* | Free* | 256K | - | - | - |
| LFM2-24B-A2B | LiquidAI | - | $0.03 | $0.12 | 32K | - | - | - |
| LFM2.5-1.2B-Instruct | LiquidAI | LFM2.5 1.2B family · instruct | Free* | Free* | 32K | - | - | - |
| LFM2.5-1.2B-Thinking | LiquidAI | LFM2.5 1.2B family · reasoning | Free* | Free* | 32K | - | - | - |
| LFM2.5-350M | LiquidAI | - | Free* | Free* | 32K | - | - | - |
| LFM2.5-VL-450M | LiquidAI | - | Free* | Free* | 128K | - | - | - |
| Mercury 2 | Inception | - | $0.25 | $0.75 | 128K | - | - | - |
| MiniMax M1 80k | MiniMax | - | N/A | N/A | 80K | - | - | - |
| MiniMax M2.5 | MiniMax | - | $0.30 | $1.20 | 128K | - | - | - |
| Ministral 3 14B | Mistral | Ministral 3 14B family · base | Free* | Free* | 128K | - | - | - |
| Ministral 3 14B (Reasoning) | Mistral | Ministral 3 14B family · reasoning | Free* | Free* | 128K | - | - | - |
| Ministral 3 3B | Mistral | Ministral 3 3B family · base | Free* | Free* | 128K | - | - | - |
| Ministral 3 3B (Reasoning) | Mistral | Ministral 3 3B family · reasoning | Free* | Free* | 128K | - | - | - |
| Ministral 3 8B | Mistral | Ministral 3 8B family · base | Free* | Free* | 128K | - | - | - |
| Ministral 3 8B (Reasoning) | Mistral | Ministral 3 8B family · reasoning | Free* | Free* | 128K | - | - | - |
| Mistral Medium 3 | Mistral | - | $0.40 | $2.00 | 128K | - | - | - |
| Mistral Small 4 | Mistral | Mistral Small 4 family · base | Free* | Free* | 256K | - | - | $2,278/mo |
| Mistral Small 4 (Reasoning) | Mistral | Mistral Small 4 family · reasoning | Free* | Free* | 256K | - | - | - |
| Nemotron 3 Super 120B A12B | NVIDIA | - | Free* | Free* | 256K | - | - | - |
| o4-mini | OpenAI | - | $1.10 | $4.40 | 200K | - | - | - |
| Qwen2.5 Coder 32B Instruct | Alibaba | Qwen2.5 Coder family · 32b-instruct | Free* | Free* | 128K | - | - | - |
| Qwen2.5-VL-32B | Alibaba | - | Free* | Free* | 32K | - | - | - |
| Qwen3.5 Flash | Alibaba | Qwen3.5 Flash family · base | N/A | N/A | 1M | - | - | - |
| Qwen3.5 Plus | Alibaba | Qwen3.5 Plus family · base | N/A | N/A | 1M | - | - | - |
| Sarvam 30B | Sarvam | Sarvam 30B family · base | Free* | Free* | 64K | - | - | - |
| Seed 1.6 | ByteDance | Seed 1.6 family · base | $0.25 | $2.00 | 256K | - | - | - |
| Seed 1.6 Flash | ByteDance | Seed 1.6 family · flash | $0.08 | $0.30 | 256K | - | - | - |
| Seed-2.0-Lite | ByteDance | Seed 2.0 family · lite | $0.25 | $2.00 | 256K | - | - | - |
| Seed-2.0-Mini | ByteDance | Seed 2.0 family · mini | $0.10 | $0.40 | 256K | - | - | - |
| Step 3.5 Flash | StepFun | - | $0.10 | $0.30 | 256K | - | - | - |
| Ternary Bonsai 1.7B | Prism ML | Ternary Bonsai family · 1-7b | Free* | Free* | 32K | - | - | - |
| Ternary Bonsai 4B | Prism ML | Ternary Bonsai family · 4b | Free* | Free* | 32K | - | - | - |
| Ternary Bonsai 8B | Prism ML | Ternary Bonsai family · 8b | Free* | Free* | 64K | - | - | - |
| Trinity-Large-Thinking | Arcee AI | Trinity Large family · thinking | $0.25 | $0.90 | 512K | - | - | - |

* Score/$ = Overall benchmark score ÷ output price per million tokens. Higher is better. Free/open-weight models are excluded.
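
As a rough illustration of the footnote's formula, the Python sketch below divides a score by the output price and skips models that have no paid output price. The function name `score_per_dollar` and the `models` dictionary are hypothetical; the three entries are copied from the table. Ratios computed this way from the displayed, rounded scores will not always match the Score/$ column, which may be derived from unrounded or normalized scores.

```python
from typing import Optional


def score_per_dollar(score: float, output_price_per_m: Optional[float]) -> Optional[float]:
    """Benchmark score divided by output price (USD per 1M output tokens)."""
    if output_price_per_m is None or output_price_per_m <= 0:
        # Free/open-weight and unpriced models are excluded, as in the footnote.
        return None
    return score / output_price_per_m


# (score, output price per 1M tokens); None means no paid API price is listed.
models = {
    "Claude Mythos Preview": (99, 125.00),
    "GPT-5.4 Pro": (92, 180.00),
    "GLM-5 (Reasoning)": (84, None),
}

for name, (score, out_price) in models.items():
    ratio = score_per_dollar(score, out_price)
    print(name, "excluded" if ratio is None else round(ratio, 1))
```

For these three rows the sketch prints 0.8, 0.5, and "excluded", in line with the table.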

Frequently Asked Questions

Which LLM has the best price-to-performance ratio?

Going by the Score/$ column, Gemini 3.1 Flash-Lite has the best ratio (100.0), followed by GPT-4.1 nano (82.5) and GPT-4o mini (81.7). DeepSeek V3 (37.3) and Gemini 3 Flash (18.7) also cost a fraction of proprietary frontier models, and Google's Flash tiers are positioned for lower-cost, high-volume usage. For self-hosted options, open-weight models such as Meta's Llama 4 carry no API fee, although the table scores them low (~18 for Maverick, ~24 for Scout) and self-hosting has its own infrastructure cost.

How much more expensive is GPT-5.4 Pro than GPT-5.4?

GPT-5.4 is $2.50 input and $15.00 output per million tokens. GPT-5.4 Pro is $30.00 input and $180.00 output. That makes Pro 12x the cost on both input and output tokens, so it only makes sense if the accuracy gain justifies a twelvefold increase in spend.
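
To put that multiple in dollar terms, the sketch below prices one workload on both models. The `request_cost` helper and the token volumes (2M input, 0.5M output) are hypothetical; only the per-million prices come from this page.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD, given prices quoted per 1M tokens."""
    return (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price


# (input price, output price) per 1M tokens, from the table.
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.4 Pro": (30.00, 180.00),
}

# Hypothetical workload: 2M input tokens and 0.5M output tokens.
base = request_cost(2_000_000, 500_000, *PRICES["GPT-5.4"])
pro = request_cost(2_000_000, 500_000, *PRICES["GPT-5.4 Pro"])

print(f"GPT-5.4:     ${base:.2f}")      # $12.50
print(f"GPT-5.4 Pro: ${pro:.2f}")       # $150.00
print(f"Ratio:       {pro / base:.1f}x")  # 12.0x
```

Because the input and output prices are both 12x higher, the ratio stays at 12x for any mix of input and output tokens.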

Are open-source LLMs really free?

Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Cloud GPU costs range from $0.50 to $5.00 per hour depending on model size, and the Self-host est. column above runs from $429/mo (Gemma 4 31B) to $18,221/mo (GLM-5.1, DeepSeek V3, DeepSeek R1). Many providers also offer API access to open models at lower prices than proprietary alternatives.
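
A back-of-the-envelope break-even between self-hosting and a paid API can be sketched as follows. The helper names, the 730 hours-per-month figure, and the 3:1 input-to-output token mix are assumptions made for illustration; the Kimi K2.5 prices and its $5,221/mo self-host estimate are taken from the table.

```python
HOURS_PER_MONTH = 730  # assumption: one GPU running around the clock


def monthly_gpu_cost(hourly_rate: float, gpus: int = 1, hours: float = HOURS_PER_MONTH) -> float:
    """Monthly bill in USD for renting `gpus` cloud GPUs at `hourly_rate`."""
    return hourly_rate * gpus * hours


def blended_price(input_price: float, output_price: float, input_per_output: float) -> float:
    """Average USD per 1M tokens when `input_per_output` input tokens accompany each output token."""
    return (input_per_output * input_price + output_price) / (input_per_output + 1)


def break_even_tokens_m(monthly_cost: float, price_per_m: float) -> float:
    """Millions of tokens per month at which API spend equals the self-hosting bill."""
    return monthly_cost / price_per_m


# The hourly range quoted above, for a single GPU.
print(monthly_gpu_cost(0.50), monthly_gpu_cost(5.00))

# Kimi K2.5: $0.50 in / $2.80 out via API, $5,221/mo self-host estimate.
api_price = blended_price(0.50, 2.80, input_per_output=3)  # assumed 3:1 mix
print(f"Break-even: {break_even_tokens_m(5221, api_price):,.0f}M tokens per month")
```

Below the break-even volume the API is cheaper under these assumptions; above it, self-hosting starts to pay off, ignoring engineering time and utilization.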

What is the cheapest LLM API in 2026?

The cheapest priced API in the table is LFM2-24B-A2B at $0.03/$0.12 per million tokens (input/output). Among proprietary APIs, Gemini 3.1 Flash-Lite is one of the cheapest at $0.10/$0.40. Gemini 2.5 Flash remains very affordable at $0.15/$0.60, Gemini 3 Flash currently sits at $0.50/$3.00, and DeepSeek V3 is also inexpensive at $0.27/$1.10. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.
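
Which API is cheapest in practice depends on how many input tokens a workload sends per output token. The sketch below ranks a handful of models from the table by a blended price; the `blended` helper and the 75% input share are hypothetical choices, not part of this page's methodology.

```python
# (input price, output price) per 1M tokens, copied from the table.
PRICES = {
    "LFM2-24B-A2B": (0.03, 0.12),
    "Seed 1.6 Flash": (0.08, 0.30),
    "Gemini 3.1 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.15, 0.60),
    "DeepSeek V3": (0.27, 1.10),
    "Gemini 3 Flash": (0.50, 3.00),
    "o3-mini": (1.10, 4.40),
}


def blended(prices: tuple, input_share: float = 0.75) -> float:
    """Weighted USD per 1M tokens; `input_share` is the assumed fraction of input tokens."""
    input_price, output_price = prices
    return input_share * input_price + (1 - input_share) * output_price


# Cheapest first under the assumed 75/25 input/output split.
ranked = sorted(PRICES, key=lambda name: blended(PRICES[name]))
for name in ranked:
    print(f"{name:<24} ${blended(PRICES[name]):.4f} per 1M blended tokens")
```

Changing `input_share` can reorder models whose input and output prices are far apart, so it is worth rerunning with a split that matches the actual workload.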
