LLM API Pricing Comparison 2026
Compare API pricing for every major LLM side by side. All prices are per million tokens. The Score/$ column uses BenchLM's provisional overall score per dollar of output — higher means better value.
Last updated: April 21, 2026. Prices reflect official API rates. Open-weight model costs depend on your infrastructure. How does LLM token pricing work?
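Token pricing is straightforward arithmetic: you pay the input rate for prompt tokens and the output rate for completion tokens, each quoted per million. A minimal sketch (the helper name is our own; rates are taken from the table below):

```python
def api_cost(input_tokens: int, output_tokens: int,
             in_rate: float, out_rate: float) -> float:
    """Cost in dollars, with rates quoted per 1M tokens."""
    return (input_tokens / 1_000_000) * in_rate \
         + (output_tokens / 1_000_000) * out_rate

# Example: a 2,000-token prompt with a 500-token reply on
# Gemini 3.1 Flash-Lite ($0.10 in / $0.40 out per 1M tokens)
cost = api_cost(2_000, 500, 0.10, 0.40)
print(f"${cost:.6f}")  # → $0.000400
```

Per-request costs at this tier are fractions of a cent; the bill only becomes interesting at volume.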
Cheapest API: LFM2-24B-A2B ($0.03 in / $0.12 out per 1M tokens)
Best Score/$: Gemini 3.1 Flash-Lite (ratio 100.0)
Top provisional score: Claude Mythos Preview (99, at $25 in / $125 out)
| Model | Input $/1M | Output $/1M | Context | Score | Score/$ | Self-host est. |
|---|---|---|---|---|---|---|
| Claude Mythos Preview Anthropic Claude Mythos family · preview | $25.00 | $125.00 | 1M | 99 | 0.8 | — |
| Claude Opus 4.7 Anthropic Claude Opus 4.7 family · base | $5.00 | $25.00 | 1M | 97 | 3.7 | — |
| GPT-5.3 Codex OpenAI | $2.50 | $10.00 | 400K | ~89 | 9.2 | — |
| GPT-5.4 OpenAI GPT-5.4 family · base | $2.50 | $15.00 | 1.05M | 93 | 6.1 | — |
| Gemini 3.1 Pro | $1.25 | $5.00 | 1M | 93 | 18.2 | — |
| GPT-5.1-Codex-Max OpenAI | $2.00 | $8.00 | 400K | ~78 | 11.4 | — |
| GPT-5.2-Codex OpenAI | $2.00 | $8.00 | 400K | ~80 | 11.4 | — |
| GPT-5.4 Pro OpenAI GPT-5.4 family · pro | $30.00 | $180.00 | 1.05M | 92 | 0.5 | — |
| Grok 4.1 xAI | $3.00 | $15.00 | 1M | ~80 | 6.0 | — |
| Claude Opus 4.6 Anthropic | $5.00 | $25.00 | 1M | 91 | 3.5 | — |
| Gemini 3 Pro Deep Think Gemini 3 Pro family · reasoning | N/A | N/A | 2M | ~86 | — | — |
| GPT-5 (medium) OpenAI GPT-5 family · reasoning | N/A | N/A | 128K | ~73 | — | — |
| GPT-5 (high) OpenAI GPT-5 family · reasoning | N/A | N/A | 128K | ~79 | — | — |
| o1-preview OpenAI o1 family · snapshot | N/A | N/A | 200K | ~68 | — | — |
| GPT-5.1 OpenAI | $1.50 | $6.00 | 200K | ~80 | 13.8 | — |
| Kimi 2.6 Moonshot AI Kimi 2.6 family · base | Free* | Free* | 256K | 83 | — | — |
| Claude Sonnet 4.6 Anthropic | $3.00 | $15.00 | 200K | 85 | 5.5 | — |
| GLM-5 (Reasoning) Z.AI GLM-5 family · reasoning | Free* | Free* | 200K | ~84 | — | — |
| Grok 4.1 Fast xAI | N/A | N/A | 1M | ~72 | — | — |
| GPT-5.2 OpenAI GPT-5.2 family · thinking | $2.00 | $8.00 | 400K | 83 | 10.0 | — |
| Qwen3.5 397B (Reasoning) Alibaba Qwen3.5 397B family · reasoning | Free* | Free* | 128K | ~80 | — | — |
| Claude Sonnet 4.5 Anthropic | $3.00 | $15.00 | 200K | ~68 | 5.1 | — |
| Gemini 3 Pro Gemini 3 Pro family · base | N/A | N/A | 2M | ~83 | — | — |
| GLM-5.1 Z.AI GLM-5 family · flagship | $1.40 | $4.40 | 203K | 84 | 17.3 | $18,221/mo |
| Claude Opus 4.5 Anthropic | N/A | N/A | 200K | 80 | — | — |
| Kimi K2.5 (Reasoning) Moonshot AI Kimi K2.5 family · reasoning | N/A | N/A | 128K | ~79 | — | — |
| o3-mini OpenAI o3 family · mini | $1.10 | $4.40 | 200K | ~58 | 16.1 | — |
| o3-pro OpenAI o3 family · pro | N/A | N/A | 200K | ~59 | — | — |
| o3 OpenAI o3 family · base | $10.00 | $40.00 | 200K | ~59 | 1.7 | — |
| Qwen3.5-122B-A10B Alibaba | Free* | Free* | 262K | 68 | — | — |
| Qwen3.6 Plus Alibaba | Free* | Free* | 1M | 77 | — | — |
| GLM-4.7 Z.AI | Free* | Free* | 200K | ~71 | — | — |
| GLM-5 Z.AI GLM-5 family · base | Free* | Free* | 200K | 77 | — | — |
| MiMo-V2-Flash Xiaomi MiMo-V2-Flash family · base | Free* | Free* | 256K | ~62 | — | — |
| Grok 4 xAI | N/A | N/A | 128K | ~67 | — | — |
| o1 OpenAI o1 family · base | $15.00 | $60.00 | 200K | ~59 | 1.1 | — |
| GPT-4.1 OpenAI GPT-4.1 family · base | $2.00 | $8.00 | 1M | ~60 | 8.1 | — |
| Qwen3.5-27B Alibaba | Free* | Free* | 262K | 65 | — | — |
| Gemini 2.5 Pro | $1.25 | $5.00 | 1M | ~67 | 12.8 | — |
| Grok 4.20 xAI Grok 4.20 family · reasoning | $2.00 | $6.00 | 2M | 77 | 10.7 | — |
| Kimi K2.5 Moonshot AI | $0.50 | $2.80 | 256K | 68 | 22.9 | $5,221/mo |
| Qwen2.5-1M Alibaba | Free* | Free* | 1M | ~53 | — | — |
| Qwen3.5 397B Alibaba | Free* | Free* | 128K | 66 | — | — |
| DeepSeek Coder 2.0 DeepSeek | $0.27 | $1.10 | 128K | ~53 | 57.3 | — |
| DeepSeekMath V2 DeepSeek DeepSeekMath family · snapshot | Free* | Free* | 128K | ~52 | — | — |
| GPT-5.4 mini OpenAI GPT-5.4 family · mini | $0.75 | $4.50 | 400K | 73 | 14.0 | — |
| Claude 4 Sonnet Anthropic | N/A | N/A | 200K | ~52 | — | — |
| Claude 4.1 Opus Anthropic Claude 4.1 Opus family · base | N/A | N/A | 200K | ~53 | — | — |
| DeepSeek V3.2 (Thinking) DeepSeek DeepSeek V3.2 family · reasoning | Free* | Free* | 128K | ~65 | — | — |
| Qwen3.5-35B-A3B Alibaba Qwen3.5-35B-A3B family · base | Free* | Free* | 262K | 59 | — | — |
| Nemotron 3 Ultra 500B NVIDIA | Free* | Free* | 10M | ~48 | — | — |
| Gemma 4 26B A4B Gemma 4 family · 26b-a4b | Free* | Free* | 256K | ~58 | — | — |
| Gemma 4 31B Gemma 4 family · 31b | Free* | Free* | 256K | ~66 | — | $429/mo |
| Qwen2.5-72B Alibaba | Free* | Free* | 128K | ~52 | — | — |
| Kimi K2 Moonshot AI | N/A | N/A | 128K | ~43 | — | — |
| Claude Haiku 4.5 Anthropic | $1.00 | $5.00 | 200K | ~59 | 11.2 | — |
| DeepSeek LLM 2.0 DeepSeek | Free* | Free* | 128K | ~53 | — | — |
| DeepSeek V3.2 DeepSeek DeepSeek V3.2 family · base | Free* | Free* | 128K | ~60 | — | — |
| Gemini 3 Flash | $0.50 | $3.00 | 1M | ~67 | 18.7 | — |
| MiniMax M2.7 MiniMax | $0.30 | $1.20 | 200K | 64 | 45.0 | — |
| o4-mini (high) OpenAI o4-mini family · reasoning | N/A | N/A | 200K | ~46 | — | — |
| Claude 3.5 Sonnet Anthropic | N/A | N/A | 200K | ~42 | — | — |
| GPT-4.1 mini OpenAI GPT-4.1 family · mini | $0.40 | $1.60 | 1M | ~47 | 32.5 | — |
| Nemotron 3 Super 100B NVIDIA | Free* | Free* | 1M | ~46 | — | — |
| Grok Code Fast 1 xAI | N/A | N/A | 256K | ~42 | — | — |
| Llama 3.1 405B Meta | Free* | Free* | 128K | ~43 | — | — |
| Mistral Large 3 Mistral | $2.00 | $6.00 | 128K | ~52 | 8.3 | $9,110/mo |
| Claude 4.1 Opus Thinking Anthropic Claude 4.1 Opus family · reasoning | N/A | N/A | 200K | ~45 | — | — |
| GPT-4o mini OpenAI GPT-4o family · mini | $0.15 | $0.60 | 128K | ~45 | 81.7 | — |
| Mistral Large 2 Mistral | N/A | N/A | 128K | ~40 | — | — |
| Sarvam 105B Sarvam Sarvam 105B family · base | Free* | Free* | 128K | ~41 | — | — |
| GPT-4o OpenAI GPT-4o family · base | $2.50 | $10.00 | 128K | ~41 | 4.3 | — |
| DeepSeek V3 DeepSeek DeepSeek family · snapshot | $0.27 | $1.10 | 128K | ~37 | 37.3 | $18,221/mo |
| Gemini 3.1 Flash-Lite | $0.10 | $0.40 | 1M | ~51 | 100.0 | — |
| Gemini 1.5 Pro | N/A | N/A | 2M | ~37 | — | — |
| Phi-4 Microsoft | Free* | Free* | 16K | ~29 | — | — |
| Qwen3 235B 2507 (Reasoning) Alibaba Qwen3 235B 2507 family · reasoning | Free* | Free* | 128K | ~48 | — | — |
| Claude 3 Opus Anthropic | N/A | N/A | 200K | ~36 | — | — |
| Gemini 2.5 Flash | $0.15 | $0.60 | 1M | ~40 | 56.7 | — |
| DBRX Instruct Databricks DBRX family · instruct | Free* | Free* | 32K | ~33 | — | — |
| GPT-4.1 nano OpenAI GPT-4.1 family · nano | $0.10 | $0.40 | 1M | ~28 | 82.5 | — |
| Claude 3 Haiku Anthropic | N/A | N/A | 200K | ~24 | — | — |
| Gemini 1.0 Pro | N/A | N/A | 32K | ~25 | — | — |
| GPT-4 Turbo OpenAI | N/A | N/A | 128K | ~27 | — | — |
| GPT-OSS 120B OpenAI GPT-OSS family · base | Free* | Free* | 128K | ~38 | — | — |
| Mistral 8x7B Mistral Mistral 8x7B family · base | Free* | Free* | 32K | ~25 | — | — |
| Nemotron Ultra 253B NVIDIA | Free* | Free* | 32K | ~23 | — | — |
| Moonshot v1 Moonshot AI Moonshot family · snapshot | N/A | N/A | 128K | ~24 | — | — |
| Z-1 Z | N/A | N/A | 128K | ~25 | — | — |
| o1-pro OpenAI o1 family · pro | $150.00 | $600.00 | 200K | ~30 | 0.1 | — |
| DeepSeek R1 DeepSeek | $0.55 | $2.19 | 128K | ~35 | 11.9 | $18,221/mo |
| Mixtral 8x22B Instruct v0.1 Mistral Mixtral 8x22B family · instruct | Free* | Free* | 64K | ~24 | — | — |
| Nemotron-4 15B NVIDIA | Free* | Free* | 32K | ~24 | — | — |
| Llama 3 70B Meta | Free* | Free* | 128K | ~28 | — | — |
| Nemotron 3 Nano 30B NVIDIA | Free* | Free* | 32K | ~27 | — | — |
| Qwen3 235B 2507 Alibaba Qwen3 235B 2507 family · base | Free* | Free* | 128K | ~35 | — | — |
| Grok 3 [Beta] xAI Grok 3 family · snapshot | N/A | N/A | 128K | ~34 | — | — |
| Llama 4 Maverick Meta | Free* | Free* | 1M | ~18 | — | $2,610/mo |
| Llama 4 Scout Meta | Free* | Free* | 10M | ~24 | — | $2,278/mo |
| DeepSeek V3.1 (Reasoning) DeepSeek DeepSeek V3.1 family · reasoning | Free* | Free* | 128K | ~32 | — | — |
| Llama 4 Behemoth Meta | Free* | Free* | 32K | ~12 | — | — |
| Nova Pro Amazon | N/A | N/A | 128K | ~11 | — | — |
| Gemma 3 27B | Free* | Free* | 32K | ~18 | — | — |
| GLM-4.5 Z.AI | N/A | N/A | 128K | ~29 | — | — |
| GPT-OSS 20B OpenAI GPT-OSS family · mini | Free* | Free* | 128K | ~19 | — | — |
| DeepSeek V3.1 DeepSeek DeepSeek V3.1 family · base | Free* | Free* | 128K | ~28 | — | — |
| GLM-4.5-Air Z.AI | N/A | N/A | 128K | ~21 | — | — |
| Mistral 7B v0.3 Mistral Mistral 7B family · snapshot | Free* | Free* | 32K | ~5 | — | — |
| Mistral 8x7B v0.2 Mistral Mistral 8x7B family · snapshot | Free* | Free* | 32K | ~2 | — | — |
| 1-bit Bonsai 1.7B Prism ML 1-bit Bonsai family · 1-7b | Free* | Free* | 32K | — | — | — |
| 1-bit Bonsai 4B Prism ML 1-bit Bonsai family · 4b | Free* | Free* | 32K | — | — | — |
| 1-bit Bonsai 8B Prism ML 1-bit Bonsai family · 8b | Free* | Free* | 64K | — | — | — |
| Aion-2.0 Aion Labs | $0.80 | $1.60 | 128K | — | — | — |
| Composer 2 Cursor Composer family · base | $0.50 | $2.50 | 200K | — | — | — |
| DeepSeek R1 Distill Qwen 32B DeepSeek DeepSeek R1 Distill family · qwen-32b | Free* | Free* | 128K | — | — | — |
| Gemma 4 E2B Gemma 4 family · e2b | Free* | Free* | 128K | — | — | — |
| Gemma 4 E4B Gemma 4 family · e4b | Free* | Free* | 128K | — | — | — |
| GLM-4.7-Flash Z.AI | Free* | Free* | 200K | — | — | — |
| GLM-5-Turbo Z.AI GLM-5 family · turbo | $1.20 | $4.00 | 200K | — | — | — |
| GLM-5V-Turbo Z.AI GLM-5 family · vision-turbo | $1.20 | $4.00 | 200K | — | — | — |
| GPT-5 mini OpenAI GPT-5 family · mini | N/A | N/A | 128K | — | — | — |
| GPT-5 nano OpenAI GPT-5 family · nano | $0.05 | $0.40 | 400K | — | — | — |
| GPT-5.2 Instant OpenAI GPT-5.2 family · instant | $1.50 | $6.00 | 128K | — | — | — |
| GPT-5.2 Pro OpenAI GPT-5.2 family · pro | $25.00 | $150.00 | 400K | — | — | — |
| GPT-5.3 Instant OpenAI GPT-5.3 family · instant | $1.75 | $14.00 | 128K | — | — | — |
| GPT-5.3-Codex-Spark OpenAI GPT-5.3 Codex family · spark | $2.00 | $8.00 | 256K | — | — | — |
| GPT-5.4 nano OpenAI GPT-5.4 family · nano | $0.20 | $1.25 | 400K | — | — | — |
| Granite-4.0-1B IBM Granite 4.0 1B family · dense | Free* | Free* | 128K | — | — | — |
| Granite-4.0-350M IBM Granite 4.0 350M family · dense | Free* | Free* | 32K | — | — | — |
| Granite-4.0-H-1B IBM Granite 4.0 1B family · hybrid | Free* | Free* | 128K | — | — | — |
| Granite-4.0-H-350M IBM Granite 4.0 350M family · hybrid | Free* | Free* | 32K | — | — | — |
| Grok 3 Mini xAI | $0.30 | $0.50 | 128K | — | — | — |
| Grok 4.20 Multi-agent xAI Grok 4.20 family · multi-agent | $2.00 | $6.00 | 2M | — | — | — |
| Holo3-122B-A10B H Company Holo3 family · 122b-a10b | $0.40 | $3.00 | 64K | — | — | — |
| Holo3-35B-A3B H Company Holo3 family · 35b-a3b | $0.25 | $1.80 | 64K | — | — | — |
| Leanstral Mistral | Free* | Free* | 256K | — | — | — |
| LFM2-24B-A2B LiquidAI | $0.03 | $0.12 | 32K | — | — | — |
| LFM2.5-1.2B-Instruct LiquidAI LFM2.5 1.2B family · instruct | Free* | Free* | 32K | — | — | — |
| LFM2.5-1.2B-Thinking LiquidAI LFM2.5 1.2B family · reasoning | Free* | Free* | 32K | — | — | — |
| LFM2.5-350M LiquidAI | Free* | Free* | 32K | — | — | — |
| LFM2.5-VL-450M LiquidAI | Free* | Free* | 128K | — | — | — |
| Mercury 2 Inception | $0.25 | $0.75 | 128K | — | — | — |
| MiniMax M1 80k MiniMax | N/A | N/A | 80K | — | — | — |
| MiniMax M2.5 MiniMax | $0.30 | $1.20 | 128K | — | — | — |
| Ministral 3 14B Mistral Ministral 3 14B family · base | Free* | Free* | 128K | — | — | — |
| Ministral 3 14B (Reasoning) Mistral Ministral 3 14B family · reasoning | Free* | Free* | 128K | — | — | — |
| Ministral 3 3B Mistral Ministral 3 3B family · base | Free* | Free* | 128K | — | — | — |
| Ministral 3 3B (Reasoning) Mistral Ministral 3 3B family · reasoning | Free* | Free* | 128K | — | — | — |
| Ministral 3 8B Mistral Ministral 3 8B family · base | Free* | Free* | 128K | — | — | — |
| Ministral 3 8B (Reasoning) Mistral Ministral 3 8B family · reasoning | Free* | Free* | 128K | — | — | — |
| Mistral Medium 3 Mistral | $0.40 | $2.00 | 128K | — | — | — |
| Mistral Small 4 Mistral Mistral Small 4 family · base | Free* | Free* | 256K | — | — | $2,278/mo |
| Mistral Small 4 (Reasoning) Mistral Mistral Small 4 family · reasoning | Free* | Free* | 256K | — | — | — |
| Nemotron 3 Super 120B A12B NVIDIA | Free* | Free* | 256K | — | — | — |
| o4-mini OpenAI | $1.10 | $4.40 | 200K | — | — | — |
| Qwen2.5 Coder 32B Instruct Alibaba Qwen2.5 Coder family · 32b-instruct | Free* | Free* | 128K | — | — | — |
| Qwen2.5-VL-32B Alibaba | Free* | Free* | 32K | — | — | — |
| Qwen3.5 Flash Alibaba Qwen3.5 Flash family · base | N/A | N/A | 1M | — | — | — |
| Qwen3.5 Plus Alibaba Qwen3.5 Plus family · base | N/A | N/A | 1M | — | — | — |
| Sarvam 30B Sarvam Sarvam 30B family · base | Free* | Free* | 64K | — | — | — |
| Seed 1.6 ByteDance Seed 1.6 family · base | $0.25 | $2.00 | 256K | — | — | — |
| Seed 1.6 Flash ByteDance Seed 1.6 family · flash | $0.08 | $0.30 | 256K | — | — | — |
| Seed-2.0-Lite ByteDance Seed 2.0 family · lite | $0.25 | $2.00 | 256K | — | — | — |
| Seed-2.0-Mini ByteDance Seed 2.0 family · mini | $0.10 | $0.40 | 256K | — | — | — |
| Step 3.5 Flash StepFun | $0.10 | $0.30 | 256K | — | — | — |
| Ternary Bonsai 1.7B Prism ML Ternary Bonsai family · 1-7b | Free* | Free* | 32K | — | — | — |
| Ternary Bonsai 4B Prism ML Ternary Bonsai family · 4b | Free* | Free* | 32K | — | — | — |
| Ternary Bonsai 8B Prism ML Ternary Bonsai family · 8b | Free* | Free* | 64K | — | — | — |
| Trinity-Large-Thinking Arcee AI Trinity Large family · thinking | $0.25 | $0.90 | 512K | — | — | — |
* Score/$ = Overall benchmark score ÷ output price per million tokens. Higher is better. Free/open-weight models are excluded.
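The footnote's definition can be applied directly. A minimal sketch, assuming the column simply divides the overall score by the output price (the published figures may include rounding or normalization we have not reproduced):

```python
def score_per_dollar(score: float, out_price_per_m: float) -> float:
    """Score/$ as defined in the footnote:
    overall benchmark score / output price per 1M tokens."""
    return score / out_price_per_m

# Claude Opus 4.7: score 97, $25.00 output per 1M tokens
print(score_per_dollar(97, 25.00))  # → 3.88
```

Because the denominator is the output price, cheap small models dominate this metric even at modest scores, which is why the budget tiers top the Score/$ column.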
Frequently Asked Questions
Which LLM has the best price-to-performance ratio?
Based on our benchmarks, models like DeepSeek V3, Gemini 3 Flash, and Gemini 3.1 Flash-Lite offer strong price-to-performance ratios. DeepSeek V3 scores competitively on benchmarks while costing a fraction of proprietary frontier models, and Google's Flash tiers are positioned for lower-cost high-volume usage. For free self-hosted options, Meta's Llama 4 models provide excellent performance at zero API cost.
How much more expensive is GPT-5.4 Pro than GPT-5.4?
GPT-5.4 costs $2.50 input and $15.00 output per million tokens; GPT-5.4 Pro costs $30.00 input and $180.00 output. That makes Pro 12x the price on both input and output tokens, so it only makes sense when the accuracy gain justifies an order-of-magnitude jump in spend.
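To make the 12x concrete, here is a hypothetical monthly workload priced under both tiers (rates from the table above; the request volume and token counts are purely illustrative):

```python
def monthly_cost(requests: int, in_tok: int, out_tok: int,
                 in_rate: float, out_rate: float) -> float:
    """Total monthly cost in dollars for a uniform workload,
    with rates quoted per 1M tokens."""
    total_in_m = requests * in_tok / 1_000_000   # input tokens, in millions
    total_out_m = requests * out_tok / 1_000_000  # output tokens, in millions
    return total_in_m * in_rate + total_out_m * out_rate

# 100k requests/month, 1,500 input + 400 output tokens each
base = monthly_cost(100_000, 1_500, 400, 2.50, 15.00)    # GPT-5.4
pro  = monthly_cost(100_000, 1_500, 400, 30.00, 180.00)  # GPT-5.4 Pro
print(f"base ${base:,.0f}  pro ${pro:,.0f}  ratio {pro/base:.1f}x")
# → base $975  pro $11,700  ratio 12.0x
```

Since both the input and output rates scale by exactly 12x here, the blended ratio is 12x regardless of the input/output mix.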
Are open-source LLMs really free?
Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Self-hosting costs range from roughly $0.50 to $5.00 per hour for cloud GPUs, depending on model size. Many providers also offer API access to open models at lower prices than proprietary alternatives.
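A back-of-the-envelope way to compare self-hosting against API pricing is to convert GPU rental cost and sustained throughput into an effective dollar cost per million generated tokens. A sketch with illustrative numbers (the GPU rate and throughput figure are assumptions, not measurements):

```python
def self_host_cost_per_m(gpu_dollars_per_hour: float,
                         tokens_per_second: float) -> float:
    """Effective $ per 1M generated tokens, assuming full utilization."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_dollars_per_hour / tokens_per_hour * 1_000_000

# e.g. a $2.50/hr GPU sustaining 1,000 tokens/s of batched generation
print(f"${self_host_cost_per_m(2.50, 1_000):.2f} per 1M tokens")
# → $0.69 per 1M tokens
```

The catch is the utilization assumption: at 10% utilization the effective cost is 10x higher, which is why self-hosting tends to pay off only for sustained, batchable workloads.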
What is the cheapest LLM API in 2026?
Gemini 3.1 Flash-Lite is among the cheapest proprietary APIs at $0.10/$0.40 per million tokens (input/output). Gemini 2.5 Flash remains very affordable at $0.15/$0.60, Gemini 3 Flash currently sits at $0.50/$3.00, and DeepSeek V3 is also inexpensive at $0.27/$1.10. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.