Compare API pricing for every major LLM side by side. All prices are per million tokens. The Score/$ column shows benchmark score per dollar of output price; higher means better value.
Last updated: April 2, 2026. Prices reflect official API rates. Open-weight model costs depend on your infrastructure.
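Because every price is quoted per million tokens, the cost of one request is the token count times the price, divided by 1,000,000, summed over input and output. A minimal sketch, assuming a hypothetical helper named `request_cost` and a made-up request size; the rates are GPT-5.4's listed $2.50 in / $15.00 out:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in: float, price_out: float) -> float:
    """Return the USD cost of one request.

    price_in and price_out are USD per 1M tokens, as listed in the table.
    """
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = request_cost(10_000, 2_000, price_in=2.50, price_out=15.00)
print(f"${cost:.4f}")  # prints $0.0550
```

Output tokens dominate here even though there are five times fewer of them, which is why the Score/$ column is based on output price.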
Lowest price: LFM2-24B-A2B, $0.03 in / $0.12 out per 1M tokens
Best Score/$: Gemini 3.1 Flash-Lite, Score/$ ratio 135.0
Highest score: GPT-5.4 Pro, Score 91, $30 in / $180 out
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Context | Score | Score/$ |
|---|---|---|---|---|---|
| GPT-5.4 Pro OpenAI GPT-5.4 family · pro | $30.00 | $180.00 | 1.05M | 92 | 0.5 |
| GPT-5.1-Codex-Max OpenAI | $2.00 | $8.00 | 400K | 81 | 10.6 |
| GPT-5.2-Codex OpenAI | $2.00 | $8.00 | 400K | 82 | 10.5 |
| GPT-5.4 OpenAI GPT-5.4 family · base | $2.50 | $15.00 | 1.05M | 82 | 5.6 |
| Grok 4.1 xAI | $3.00 | $15.00 | 1M | 85 | 5.6 |
| Gemini 3.1 Pro | $1.25 | $5.00 | 1M | 87 | 16.6 |
| GPT-5.3 Codex OpenAI | $2.50 | $10.00 | 400K | 85 | 8.3 |
| Claude Opus 4.6 Anthropic | $15.00 | $75.00 | 1M | 85 | 1.1 |
| Gemini 3 Pro Deep Think Gemini 3 Pro family · reasoning | N/A | N/A | 2M | 80 | — |
| GPT-5 (high) OpenAI GPT-5 family · reasoning | N/A | N/A | 128K | 82 | — |
| GPT-5.2 OpenAI GPT-5.2 family · thinking | $2.00 | $8.00 | 400K | 82 | 9.9 |
| GLM-5 (Reasoning) Zhipu AI GLM-5 family · reasoning | Free* | Free* | 200K | 82 | — |
| GPT-5 (medium) OpenAI GPT-5 family · reasoning | N/A | N/A | 128K | 76 | — |
| GPT-5.1 OpenAI | $1.50 | $6.00 | 200K | 78 | 12.8 |
| Grok 4.1 Fast xAI | N/A | N/A | 1M | 70 | — |
| o1-preview OpenAI o1 family · snapshot | N/A | N/A | 200K | 72 | — |
| Qwen3.5 397B (Reasoning) Alibaba Qwen3.5 397B family · reasoning | Free* | Free* | 128K | 77 | — |
| Gemini 3 Pro Gemini 3 Pro family · base | N/A | N/A | 2M | 79 | — |
| Claude Sonnet 4.6 Anthropic | $3.00 | $15.00 | 200K | 84 | 4.9 |
| Kimi K2.5 (Reasoning) Moonshot AI Kimi K2.5 family · reasoning | N/A | N/A | 128K | 76 | — |
| Claude Opus 4.5 Anthropic | N/A | N/A | 200K | 76 | — |
| Gemma 4 31B Gemma 4 family · 31b | Free* | Free* | 256K | 73 | — |
| Claude Sonnet 4.5 Anthropic | $3.00 | $15.00 | 200K | 68 | 4.7 |
| o3-pro OpenAI o3 family · pro | N/A | N/A | 200K | 67 | — |
| Qwen3.5-122B-A10B Alibaba | Free* | Free* | 262K | 71 | — |
| MiMo-V2-Flash Xiaomi MiMo-V2-Flash family · base | Free* | Free* | 256K | 67 | — |
| o3-mini OpenAI o3 family · mini | $1.10 | $4.40 | 200K | 65 | 15.9 |
| Qwen3.5-27B Alibaba | Free* | Free* | 262K | 70 | — |
| GLM-4.7 Zhipu AI | Free* | Free* | 200K | 74 | — |
| Kimi K2.5 Moonshot AI | $0.50 | $2.80 | 128K | 72 | 24.6 |
| Qwen3.6 Plus Alibaba | Free* | Free* | 1M | 69 | — |
| GLM-5 Zhipu AI GLM-5 family · base | Free* | Free* | 200K | 75 | — |
| o3 OpenAI o3 family · base | $10.00 | $40.00 | 200K | 64 | 1.7 |
| Qwen3.5-35B-A3B Alibaba Qwen3.5-35B-A3B family · base | Free* | Free* | 262K | 66 | — |
| GPT-4.1 OpenAI GPT-4.1 family · base | $2.00 | $8.00 | 1M | 64 | 8.4 |
| Grok 4 xAI | N/A | N/A | 128K | 68 | — |
| o1 OpenAI o1 family · base | $15.00 | $60.00 | 200K | 64 | 1.1 |
| Qwen2.5-1M Alibaba | Free* | Free* | 1M | 62 | — |
| Qwen3.5 397B Alibaba | Free* | Free* | 128K | 68 | — |
| DeepSeek Coder 2.0 DeepSeek | $0.27 | $1.10 | 128K | 62 | 60.0 |
| DeepSeek V3.2 (Thinking) DeepSeek DeepSeek V3.2 family · reasoning | Free* | Free* | 128K | 67 | — |
| DeepSeekMath V2 DeepSeek DeepSeekMath family · snapshot | Free* | Free* | 128K | 63 | — |
| Claude 4 Sonnet Anthropic | N/A | N/A | 200K | 62 | — |
| Gemini 2.5 Pro | $1.25 | $5.00 | 1M | 65 | 13.0 |
| Nemotron 3 Ultra 500B NVIDIA | Free* | Free* | 10M | 60 | — |
| Claude 4.1 Opus Anthropic Claude 4.1 Opus family · base | N/A | N/A | 200K | 62 | — |
| Gemini 3 Flash | $0.50 | $3.00 | 1M | 67 | 21.3 |
| Gemma 4 26B A4B Gemma 4 family · 26b-a4b | Free* | Free* | 256K | 64 | — |
| Qwen2.5-72B Alibaba | Free* | Free* | 128K | 60 | — |
| Claude Haiku 4.5 Anthropic | $0.80 | $4.00 | 200K | 63 | 15.5 |
| DeepSeek LLM 2.0 DeepSeek | Free* | Free* | 128K | 57 | — |
| DeepSeek V3.2 DeepSeek DeepSeek V3.2 family · base | Free* | Free* | 128K | 61 | — |
| o4-mini (high) OpenAI o4-mini family · reasoning | N/A | N/A | 200K | 58 | — |
| Claude 3.5 Sonnet Anthropic | N/A | N/A | 200K | 55 | — |
| GPT-5.4 mini OpenAI GPT-5.4 family · mini | $0.75 | $4.50 | 400K | 66 | 13.3 |
| Grok Code Fast 1 xAI | N/A | N/A | 256K | 56 | — |
| Kimi K2 Moonshot AI | N/A | N/A | 128K | 53 | — |
| Mistral Large 3 Mistral | $2.00 | $6.00 | 128K | 58 | 10.0 |
| Nemotron 3 Super 100B NVIDIA | Free* | Free* | 1M | 56 | — |
| GPT-4.1 mini OpenAI GPT-4.1 family · mini | $0.40 | $1.60 | 1M | 57 | 36.9 |
| Llama 3.1 405B Meta | Free* | Free* | 128K | 53 | — |
| Mistral Large 2 Mistral | N/A | N/A | 128K | 52 | — |
| Claude 4.1 Opus Thinking Anthropic Claude 4.1 Opus family · reasoning | N/A | N/A | 200K | 57 | — |
| GPT-4o mini OpenAI GPT-4o family · mini | $0.15 | $0.60 | 128K | 54 | 95.0 |
| GPT-4o OpenAI GPT-4o family · base | $2.50 | $10.00 | 128K | 50 | 5.5 |
| DeepSeek V3 DeepSeek DeepSeek family · snapshot | $0.27 | $1.10 | 128K | 49 | 49.1 |
| Gemini 3.1 Flash-Lite | $0.10 | $0.40 | 1M | 56 | 135.0 |
| Gemini 1.5 Pro | N/A | N/A | 2M | 50 | — |
| Qwen3 235B 2507 (Reasoning) Alibaba Qwen3 235B 2507 family · reasoning | Free* | Free* | 128K | 55 | — |
| GPT-4.1 nano OpenAI GPT-4.1 family · nano | $0.10 | $0.40 | 1M | 44 | 127.5 |
| Claude 3 Opus Anthropic | N/A | N/A | 200K | 49 | — |
| Gemini 2.5 Flash | $0.15 | $0.60 | 1M | 50 | 83.3 |
| Claude 3 Haiku Anthropic | N/A | N/A | 200K | 43 | — |
| GPT-5.4 nano OpenAI GPT-5.4 family · nano | $0.20 | $1.25 | 400K | 58 | 39.2 |
| GPT-OSS 120B OpenAI GPT-OSS family · base | Free* | Free* | 128K | 50 | — |
| Phi-4 Microsoft | Free* | Free* | 16K | 40 | — |
| Mistral 8x7B Mistral Mistral 8x7B family · base | Free* | Free* | 32K | 44 | — |
| Nemotron Ultra 253B NVIDIA | Free* | Free* | 32K | 41 | — |
| GPT-4 Turbo OpenAI | N/A | N/A | 128K | 43 | — |
| Moonshot v1 Moonshot AI Moonshot family · snapshot | N/A | N/A | 128K | 43 | — |
| Z-1 Z | N/A | N/A | 128K | 44 | — |
| Gemini 1.0 Pro | N/A | N/A | 32K | 40 | — |
| Nemotron-4 15B NVIDIA | Free* | Free* | 32K | 42 | — |
| DeepSeek R1 DeepSeek | $0.55 | $2.19 | 128K | 45 | 20.6 |
| Llama 3 70B Meta | Free* | Free* | 128K | 44 | — |
| Nemotron 3 Nano 30B NVIDIA | Free* | Free* | 32K | 42 | — |
| o1-pro OpenAI o1 family · pro | $150.00 | $600.00 | 200K | 45 | 0.1 |
| Qwen3 235B 2507 Alibaba Qwen3 235B 2507 family · base | Free* | Free* | 128K | 47 | — |
| Grok 3 [Beta] xAI Grok 3 family · snapshot | N/A | N/A | 128K | 48 | — |
| Llama 4 Maverick Meta | Free* | Free* | 1M | 39 | — |
| DBRX Instruct Databricks DBRX family · instruct | Free* | Free* | 32K | 41 | — |
| DeepSeek V3.1 (Reasoning) DeepSeek DeepSeek V3.1 family · reasoning | Free* | Free* | 128K | 43 | — |
| Llama 4 Scout Meta | Free* | Free* | 10M | 44 | — |
| Llama 4 Behemoth Meta | Free* | Free* | 32K | 34 | — |
| Mixtral 8x22B Instruct v0.1 Mistral Mixtral 8x22B family · instruct | Free* | Free* | 64K | 36 | — |
| Gemma 3 27B | Free* | Free* | 32K | 35 | — |
| Nova Pro Amazon | N/A | N/A | 128K | 33 | — |
| GLM-4.5-Air Zhipu AI | N/A | N/A | 128K | 38 | — |
| GLM-4.5 Zhipu AI | N/A | N/A | 128K | 40 | — |
| DeepSeek V3.1 DeepSeek DeepSeek V3.1 family · base | Free* | Free* | 128K | 41 | — |
| GPT-OSS 20B OpenAI GPT-OSS family · mini | Free* | Free* | 128K | 36 | — |
| Mistral 7B v0.3 Mistral Mistral 7B family · snapshot | Free* | Free* | 32K | 29 | — |
| Mistral 8x7B v0.2 Mistral Mistral 8x7B family · snapshot | Free* | Free* | 32K | 27 | — |
| 1-bit Bonsai 1.7B Prism ML 1-bit Bonsai family · 1-7b | Free* | Free* | 32K | — | — |
| 1-bit Bonsai 4B Prism ML 1-bit Bonsai family · 4b | Free* | Free* | 32K | — | — |
| 1-bit Bonsai 8B Prism ML 1-bit Bonsai family · 8b | Free* | Free* | 64K | — | — |
| Aion-2.0 Aion Labs | $0.80 | $1.60 | 128K | — | — |
| Composer 2 Cursor Composer family · base | $0.50 | $2.50 | 200K | — | — |
| DeepSeek R1 Distill Qwen 32B DeepSeek DeepSeek R1 Distill family · qwen-32b | Free* | Free* | 128K | — | — |
| Gemma 4 E2B Gemma 4 family · e2b | Free* | Free* | 128K | — | — |
| Gemma 4 E4B Gemma 4 family · e4b | Free* | Free* | 128K | — | — |
| GLM-4.7-Flash Zhipu AI | Free* | Free* | 200K | — | — |
| GLM-5-Turbo Z.AI GLM-5 family · turbo | $1.20 | $4.00 | 200K | — | — |
| GLM-5V-Turbo Z.AI GLM-5 family · vision-turbo | $1.20 | $4.00 | 200K | — | — |
| GPT-5 mini OpenAI GPT-5 family · mini | N/A | N/A | 128K | — | — |
| GPT-5 nano OpenAI GPT-5 family · nano | $0.05 | $0.40 | 400K | — | — |
| GPT-5.2 Instant OpenAI GPT-5.2 family · instant | $1.50 | $6.00 | 128K | — | — |
| GPT-5.2 Pro OpenAI GPT-5.2 family · pro | $25.00 | $150.00 | 400K | — | — |
| GPT-5.3 Instant OpenAI GPT-5.3 family · instant | $1.75 | $14.00 | 128K | — | — |
| GPT-5.3-Codex-Spark OpenAI GPT-5.3 Codex family · spark | $2.00 | $8.00 | 256K | — | — |
| Granite-4.0-1B IBM Granite 4.0 1B family · dense | Free* | Free* | 128K | — | — |
| Granite-4.0-350M IBM Granite 4.0 350M family · dense | Free* | Free* | 32K | — | — |
| Granite-4.0-H-1B IBM Granite 4.0 1B family · hybrid | Free* | Free* | 128K | — | — |
| Granite-4.0-H-350M IBM Granite 4.0 350M family · hybrid | Free* | Free* | 32K | — | — |
| Grok 3 Mini xAI | $0.30 | $0.50 | 128K | — | — |
| Grok 4.20 xAI Grok 4.20 family · reasoning | $2.00 | $6.00 | 2M | — | — |
| Grok 4.20 Multi-agent xAI Grok 4.20 family · multi-agent | $2.00 | $6.00 | 2M | — | — |
| Holo3-122B-A10B H Company Holo3 family · 122b-a10b | $0.40 | $3.00 | 64K | — | — |
| Holo3-35B-A3B H Company Holo3 family · 35b-a3b | $0.25 | $1.80 | 64K | — | — |
| Leanstral Mistral | Free* | Free* | 256K | — | — |
| LFM2-24B-A2B LiquidAI | $0.03 | $0.12 | 32K | — | — |
| LFM2.5-1.2B-Instruct LiquidAI LFM2.5 1.2B family · instruct | Free* | Free* | 32K | — | — |
| LFM2.5-1.2B-Thinking LiquidAI LFM2.5 1.2B family · reasoning | Free* | Free* | 32K | — | — |
| LFM2.5-350M LiquidAI | Free* | Free* | 32K | — | — |
| Mercury 2 Inception | $0.25 | $0.75 | 128K | — | — |
| MiniMax M1 80k MiniMax | N/A | N/A | 80K | — | — |
| MiniMax M2.5 MiniMax | $0.30 | $1.20 | 128K | — | — |
| MiniMax M2.7 MiniMax | $0.30 | $1.20 | 200K | — | — |
| Ministral 3 14B Mistral Ministral 3 14B family · base | Free* | Free* | 128K | — | — |
| Ministral 3 14B (Reasoning) Mistral Ministral 3 14B family · reasoning | Free* | Free* | 128K | — | — |
| Ministral 3 3B Mistral Ministral 3 3B family · base | Free* | Free* | 128K | — | — |
| Ministral 3 3B (Reasoning) Mistral Ministral 3 3B family · reasoning | Free* | Free* | 128K | — | — |
| Ministral 3 8B Mistral Ministral 3 8B family · base | Free* | Free* | 128K | — | — |
| Ministral 3 8B (Reasoning) Mistral Ministral 3 8B family · reasoning | Free* | Free* | 128K | — | — |
| Mistral Medium 3 Mistral | $0.40 | $2.00 | 128K | — | — |
| Mistral Small 4 Mistral Mistral Small 4 family · base | Free* | Free* | 256K | — | — |
| Mistral Small 4 (Reasoning) Mistral Mistral Small 4 family · reasoning | Free* | Free* | 256K | — | — |
| Nemotron 3 Super 120B A12B NVIDIA | Free* | Free* | 256K | — | — |
| o4-mini OpenAI | $1.10 | $4.40 | 200K | — | — |
| Qwen2.5 Coder 32B Instruct Alibaba Qwen2.5 Coder family · 32b-instruct | Free* | Free* | 128K | — | — |
| Qwen2.5-VL-32B Alibaba | Free* | Free* | 32K | — | — |
| Qwen3.5 Flash Alibaba Qwen3.5 Flash family · base | N/A | N/A | 1M | — | — |
| Qwen3.5 Plus Alibaba Qwen3.5 Plus family · base | N/A | N/A | 1M | — | — |
| Seed 1.6 ByteDance Seed 1.6 family · base | $0.25 | $2.00 | 256K | — | — |
| Seed 1.6 Flash ByteDance Seed 1.6 family · flash | $0.08 | $0.30 | 256K | — | — |
| Seed-2.0-Lite ByteDance Seed 2.0 family · lite | $0.25 | $2.00 | 256K | — | — |
| Seed-2.0-Mini ByteDance Seed 2.0 family · mini | $0.10 | $0.40 | 256K | — | — |
| Step 3.5 Flash StepFun | $0.10 | $0.30 | 256K | — | — |
| Trinity-Large-Thinking Arcee AI Trinity Large family · thinking | $0.25 | $0.90 | 512K | — | — |
* Score/$ = Overall benchmark score ÷ output price per million tokens. Higher is better. Free/open-weight models are excluded.
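The footnote's formula can be sketched as follows. The function name `score_per_dollar` is hypothetical, and the three rows are copied from the table above. Note that for some other rows the published ratio does not reproduce exactly from the rounded score shown, so treat the result as approximate.

```python
from typing import Optional


def score_per_dollar(score: float, output_price: Optional[float]) -> Optional[float]:
    """Score/$ = benchmark score / output price per 1M tokens.

    Returns None when there is no output price (free or unpriced models),
    mirroring the table, which excludes those rows.
    """
    if not output_price:
        return None
    return score / output_price


# (score, output price in USD per 1M tokens), taken from the table.
rows = {
    "Claude Opus 4.6": (85, 75.00),
    "o1": (64, 60.00),
    "o1-pro": (45, 600.00),
}

for name, (score, price) in rows.items():
    print(name, round(score_per_dollar(score, price), 1))
# Claude Opus 4.6 1.1
# o1 1.1
# o1-pro 0.1
```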
Based on our benchmarks, DeepSeek V3, Gemini 3 Flash, and Gemini 3.1 Flash-Lite offer strong price-to-performance ratios, with Score/$ values of 49.1, 21.3, and 135.0 respectively. DeepSeek V3 costs a fraction of what proprietary frontier models charge ($0.27/$1.10 per million tokens), and Google's Flash tiers are positioned for lower-cost, high-volume usage. For free self-hosted options, Meta's Llama 4 models carry zero API cost, although their scores in the table above (34 to 44) trail other open-weight entries such as GLM-5 (75).
GPT-5.4 costs $2.50 input and $15.00 output per million tokens. GPT-5.4 Pro costs $30.00 input and $180.00 output. That makes Pro 12x the cost on both input and output tokens, so it only makes sense if the accuracy gain justifies the extra spend.
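To see what the 12x multiple means for a budget, here is a rough sketch that prices the same workload on both tiers. The monthly volume (50M input tokens, 10M output tokens) and the helper name `monthly_cost` are assumptions for illustration; the prices are the ones quoted above.

```python
# Prices in USD per 1M tokens, as quoted above.
GPT_5_4 = {"in": 2.50, "out": 15.00}
GPT_5_4_PRO = {"in": 30.00, "out": 180.00}


def monthly_cost(prices: dict, input_millions: float, output_millions: float) -> float:
    """USD cost for a month's traffic, with volumes given in millions of tokens."""
    return prices["in"] * input_millions + prices["out"] * output_millions


# Hypothetical workload: 50M input tokens and 10M output tokens per month.
base = monthly_cost(GPT_5_4, 50, 10)
pro = monthly_cost(GPT_5_4_PRO, 50, 10)

print(f"GPT-5.4:     ${base:,.2f}")        # GPT-5.4:     $275.00
print(f"GPT-5.4 Pro: ${pro:,.2f}")         # GPT-5.4 Pro: $3,300.00
print(f"Multiple:    {pro / base:.0f}x")   # Multiple:    12x
```

Because both the input and output rates scale by the same factor, the multiple stays at 12x regardless of the input-to-output mix.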
Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Self-hosting costs range from $0.50 to $5.00 per hour for cloud GPUs, depending on model size. Many providers also offer API access to open models at lower prices than proprietary alternatives.
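An hourly GPU rate can be converted into an effective price per million tokens for comparison with the API rates in the table. The sketch below assumes a sustained throughput of 500 tokens per second and full utilization; both are hypothetical, and real throughput depends heavily on the model, hardware, and batching. The function name is also illustrative.

```python
def self_hosted_cost_per_million(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Effective USD per 1M generated tokens for a GPU billed by the hour.

    Assumes the GPU is busy for the whole hour; idle time raises the real cost.
    """
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Hypothetical throughput; the hourly rates are the ends of the range quoted above.
THROUGHPUT = 500  # tokens per second (assumption)

low = self_hosted_cost_per_million(0.50, THROUGHPUT)
high = self_hosted_cost_per_million(5.00, THROUGHPUT)
print(f"${low:.2f} to ${high:.2f} per 1M tokens")  # prints $0.28 to $2.78 per 1M tokens
```

Under these assumptions the result is in the same range as several paid APIs in the table, so utilization is the main factor in whether self-hosting comes out cheaper.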
Gemini 3.1 Flash-Lite is among the cheapest proprietary APIs at $0.10/$0.40 per million tokens (input/output). Gemini 2.5 Flash remains very affordable at $0.15/$0.60, Gemini 3 Flash currently sits at $0.50/$3.00, and DeepSeek V3 is also inexpensive at $0.27/$1.10. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.