Full LLM API pricing comparison for 2026 — input/output token costs for GPT-5, Claude, Gemini, DeepSeek, Grok, and more. Find the cheapest model for your use case.
GPT-5 nano is the cheapest major LLM API at $0.05 per million input tokens. GPT-5.4 Pro is the most expensive at $30/$180. Claude Opus 4.6 costs $5/$25. For most production workloads, GPT-5.4 at $2.50/$15 still hits one of the best balances of capability and cost.
Pricing varies by more than 600x across major LLM APIs — from $0.05 to $30 per million input tokens. The right model for your workload depends on the task, volume, and how much quality you're trading for cost. This guide covers current pricing for every major model and breaks down the math for the most common use cases.
All prices are per million tokens. Check the BenchLM.ai pricing page for live pricing — rates change frequently.
| Model | Creator | Input | Output | Overall Score |
|---|---|---|---|---|
| GPT-5 nano | OpenAI | $0.05 | $0.40 | — |
| Gemini 3.1 Flash-Lite | Google | $0.25 | $1.50 | — |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | — |
| DeepSeek Coder 2.0 | DeepSeek | $0.27 | $1.10 | — |
| Grok 3 Mini | xAI | $0.30 | $0.50 | — |
| Gemini 3 Flash | Google | $0.50 | $3.00 | — |
| Mistral Large 3 | Mistral | $0.50 | $1.50 | — |
| DeepSeek R1 | DeepSeek | $0.55 | $2.19 | — |
| GPT-5.1 | OpenAI | $1.50 | $6.00 | 67 |
| GPT-5.2 Instant | OpenAI | $1.50 | $6.00 | 64 |
| GPT-5.2 | OpenAI | $1.75 | $14.00 | 77 |
| GPT-5.2-Codex | OpenAI | $1.75 | $14.00 | 73 |
| GPT-5.3 Instant | OpenAI | $1.75 | $14.00 | 65 |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 94 |
| GPT-5.3-Codex-Spark | OpenAI | $2.00 | $8.00 | 63 |
| GPT-5.3 Codex | OpenAI | $2.50 | $10.00 | 80 |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | 94 |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 68 |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | 85 |
| GPT-5.2 Pro | OpenAI | $25.00 | $150.00 | 66 |
| GPT-5.4 Pro | OpenAI | $30.00 | $180.00 | 91 |
| Grok 4.1 | xAI | — | — | 76 |
Benchmark scores from BenchLM.ai leaderboard. Prices per million tokens.
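Every comparison in this guide comes down to the same two-term formula: input tokens times input price, plus output tokens times output price, divided by one million. A minimal sketch (prices hardcoded from the table above; the dictionary keys are illustrative names, not official API model IDs):

```python
# (input $/M, output $/M) for a few models from the table above.
PRICES = {
    "gpt-5-nano": (0.05, 0.40),
    "deepseek-v3": (0.27, 1.10),
    "gemini-3.1-pro": (2.00, 12.00),
    "gpt-5.4": (2.50, 15.00),
    "claude-sonnet-4.6": (3.00, 15.00),
    "claude-opus-4.6": (5.00, 25.00),
    "gpt-5.4-pro": (30.00, 180.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A 500-token prompt with a 300-token reply on GPT-5.4:
print(round(request_cost("gpt-5.4", 500, 300), 6))  # 0.00575
```

Swap in your own token counts and the table's prices to reproduce any figure below.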
Under $0.50/M input — Nano and flash models. GPT-5 nano, Gemini 3.1 Flash-Lite, DeepSeek V3, Grok 3 Mini. Best for high-volume, lower-stakes tasks: classification, summarization, simple Q&A. Quality varies significantly.
$1-3/M input — The production sweet spot. Gemini 3.1 Pro ($2.00), GPT-5.1 ($1.50), GPT-5.4 ($2.50), Claude Sonnet 4.6 ($3.00). Strong frontier performance at reasonable cost. Most teams live here.
$5-30/M input — Flagship tier. Claude Opus 4.6 ($5), GPT-5.2 Pro ($25), GPT-5.4 Pro ($30). Reserved for tasks where the extra capability is worth the price — legal analysis, complex research, high-stakes decisions.
At $2.50/M input, each $1 of input budget buys ~400K input tokens of GPT-5.4. For a typical chat application averaging 500 input tokens per message, that's about 800 messages per dollar of input spend. At that scale, GPT-5.4 and Claude Sonnet 4.6 ($3.00) are both reasonable choices.
Volume is what separates the tiers. At 10M input tokens/month, the gap between GPT-5.4 ($2.50/M) and GPT-5.4 Pro ($30/M) is about $275/month, or $3,300/year; at 100M tokens/month it grows to $33K/year. That's where the flagship vs mid-tier decision really matters.
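The break-even arithmetic generalizes to any pair of prices. An illustrative helper (not from the source) for the annualized gap between two per-million-token rates at a given monthly volume:

```python
def annual_gap(price_a: float, price_b: float, tokens_per_month: float) -> float:
    """Yearly dollar difference between two per-million-token prices
    at a constant monthly token volume."""
    return abs(price_a - price_b) * tokens_per_month / 1_000_000 * 12

# GPT-5.4 ($2.50/M input) vs GPT-5.4 Pro ($30/M input):
print(annual_gap(2.50, 30.00, 10_000_000))   # 3300.0  (10M tokens/month)
print(annual_gap(2.50, 30.00, 100_000_000))  # 33000.0 (100M tokens/month)
```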
For a document-processing pipeline, assuming a 10-page document ≈ 4,000 input tokens and 500 output tokens:
| Model | Cost per doc |
|---|---|
| Gemini 3.1 Flash-Lite | $0.0018 |
| DeepSeek V3 | $0.0016 |
| Gemini 3.1 Pro | $0.0140 |
| GPT-5.4 | $0.0175 |
| Claude Sonnet 4.6 | $0.0195 |
| Claude Opus 4.6 | $0.0325 |
For document pipelines processing thousands of documents per day, model selection has a direct P&L impact. Gemini 3.1 Pro at about $0.014/doc versus Claude Opus 4.6 at about $0.0325/doc more than doubles the cost: at 10,000 docs/day, that's roughly $140/day versus $325/day.
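The per-day figures above can be sketched with the same token assumptions (the 10,000 docs/day default is the illustrative volume used here, not a figure from any provider):

```python
def daily_cost(in_price: float, out_price: float, docs_per_day: int = 10_000,
               in_tokens: int = 4_000, out_tokens: int = 500) -> float:
    """Daily pipeline spend: per-document cost times document volume."""
    per_doc = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return docs_per_day * per_doc

# At 10,000 docs/day:
print(round(daily_cost(2.00, 12.00), 2))  # 140.0  (Gemini 3.1 Pro)
print(round(daily_cost(5.00, 25.00), 2))  # 325.0  (Claude Opus 4.6)
```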
If you need GPT-5-class quality without GPT-5 pricing:
Gemini 3.1 Pro ($2/$12) is tied with GPT-5.4 at 94 overall while still costing slightly less. For many production workloads, that remains one of the clearest value plays in the frontier tier.
Claude Sonnet 4.6 ($3/$15) scores 68 overall — lower than GPT-5.4, but still a viable option for writing, coding, and structured output if you prefer Anthropic's ecosystem.
GPT-5.2 ($1.75/$14) scores 77 overall — lower than GPT-5.4, but still a viable value row for teams that care about cost more than absolute frontier standing.
DeepSeek V3 at $0.27/$1.10 vs GPT-5.4 at $2.50/$15 is roughly a 9x gap on input and a 14x gap on output. For a pipeline generating 1M output tokens per day, that's about $1.10/day on DeepSeek V3 versus $15/day on GPT-5.4, or roughly $400 vs $5,475 per year on output tokens alone.
The question is whether GPT-5.4's benchmark advantage justifies the 14x output cost premium. For general text generation, creative writing, and many coding tasks — probably not. For hard reasoning, agentic workflows, and tasks requiring frontier-level reliability — the benchmark gap is real.
DeepSeek R1 (the reasoning model, at $0.55/$2.19) vs GPT-5.4 Pro ($30/$180) is an even starker comparison: output tokens are ~82x cheaper. The quality gap on hard reasoning is real, but it is rarely an 82x gap.
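The yearly numbers behind these comparisons are straightforward to reproduce. A quick sketch for output-token spend at a constant daily volume:

```python
def yearly_output_cost(out_price_per_m: float,
                       tokens_per_day: float = 1_000_000) -> float:
    """Output-token spend per year at a constant daily token volume."""
    return out_price_per_m * tokens_per_day / 1_000_000 * 365

# At 1M output tokens/day:
print(round(yearly_output_cost(1.10), 2))   # 401.5   (DeepSeek V3)
print(round(yearly_output_cost(15.00), 2))  # 5475.0  (GPT-5.4)

# Output-price ratio, DeepSeek R1 vs GPT-5.4 Pro:
print(round(180.00 / 2.19, 1))              # 82.2
```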
- Free tier / experiments: GPT-5 nano, Gemini 3.1 Flash-Lite
- Serious prototypes: DeepSeek V3, Gemini 3 Flash
- Small production apps (under $500/mo): Gemini 3.1 Pro, GPT-5.1, Claude Sonnet 4.6
- Scale production (high volume): GPT-5.4, GPT-5.2
- Enterprise / high-stakes workflows: Claude Opus 4.6, GPT-5.4 Pro
→ Use the Cost Calculator to estimate your monthly spend · Full pricing table
What is the cheapest LLM API in 2026? GPT-5 nano at $0.05/$0.40 per million tokens. Gemini 3.1 Flash-Lite at $0.25/$1.50 follows. For better quality at still-cheap pricing, DeepSeek V3 at $0.27/$1.10 is the standout budget option.
How much does GPT-5.4 cost? $2.50 per million input tokens, $15 per million output tokens. GPT-5.4 Pro costs $30/$180 — more than 10x higher. For most teams, standard GPT-5.4 is the better value.
How much does Claude Opus 4.6 cost? $5 per million input tokens, $25 per million output tokens. Claude Sonnet 4.6 at $3/$15 is still materially cheaper for most tasks.
Is DeepSeek cheaper than GPT-5? Yes, by a large margin. DeepSeek V3 at $0.27/$1.10 is 9x cheaper on input and 14x cheaper on output than GPT-5.4. For high-volume workloads where the quality gap is acceptable, the savings are substantial.
What is the best value LLM in 2026? Gemini 3.1 Pro ($2/$12) for one of the best capability-to-cost ratios in the frontier tier. It is tied with GPT-5.4 at 94 overall while still costing slightly less. DeepSeek V3 ($0.27/$1.10) remains the budget-first pick.
Prices current as of March 2026. Check BenchLM.ai/pricing for the latest rates.