Compare API pricing for every major LLM. Prices are per million tokens. According to BenchLM.ai, the best price-to-performance ratios come from models like DeepSeek V3 and Gemini Flash.
Last updated: March 7, 2026. Prices reflect official API rates. Self-hosted open-weight model costs depend on your infrastructure.
LLM API pricing varies by over 100x between the cheapest and most expensive models. At the low end, models like Gemini 3.0 Flash and DeepSeek V3 cost fractions of a cent per request. At the high end, frontier reasoning models like Claude Opus 4.6 and o3 can cost dollars per complex query.
The Score/$ column shows benchmark performance per dollar spent on output tokens — higher means better value. This helps identify models that deliver strong benchmark results without breaking the budget. Open-weight models like Llama 4 and DeepSeek are excluded from this ratio since their API pricing is zero (though self-hosting has separate infrastructure costs).
For most teams, the choice comes down to whether you need absolute frontier performance or can accept 90% of the quality at 10% of the cost. Use our cost calculator to model your specific usage, or take the LLM selector quiz to find the best model for your use case and budget.
| Model | |||||
|---|---|---|---|---|---|
| Llama 4 Maverick Meta | Free* | Free* | 1M | 42 | — |
| Llama 4 Scout Meta | Free* | Free* | 10M | 42 | — |
| Qwen3.5 235B Alibaba | Free* | Free* | 128K | — | — |
| Qwen3.5 397B Alibaba | Free* | Free* | 128K | 71 | — |
| Grok 3 Mini xAI | $0.30 | $0.50 | 128K | — | — |
| Gemini 3.0 Flash | $0.15 | $0.60 | 1M | — | — |
| Gemini 2.5 Flash | $0.15 | $0.60 | 1M | 49 | 81.7 |
| DeepSeek V3 DeepSeek | $0.27 | $1.10 | 128K | — | — |
| DeepSeek Coder 2.0 DeepSeek | $0.27 | $1.10 | 128K | 73 | 66.4 |
| Mistral Medium 3 Mistral | $0.40 | $2.00 | 128K | — | — |
| DeepSeek R1 DeepSeek | $0.55 | $2.19 | 128K | — | — |
| Claude Haiku 4.5 Anthropic | $0.80 | $4.00 | 200K | 64 | 16.0 |
| o3-mini OpenAI | $1.10 | $4.40 | 200K | — | — |
| o4-mini OpenAI | $1.10 | $4.40 | 200K | — | — |
| Gemini 3.1 Pro | $1.25 | $5.00 | 1M | 89 | 17.8 |
| Gemini 2.5 Pro | $1.25 | $5.00 | 1M | 72 | 14.4 |
| GPT-5.1 OpenAI | $1.50 | $6.00 | 200K | 85 | 14.2 |
| Mistral Large 3 Mistral | $2.00 | $6.00 | 128K | 68 | 11.3 |
| GPT-5.2 OpenAI | $2.00 | $8.00 | 400K | 91 | 11.4 |
| GPT-5.1-Codex-Max OpenAI | $2.00 | $8.00 | 400K | 87 | 10.9 |
| GPT-5.2-Codex OpenAI | $2.00 | $8.00 | 400K | 88 | 11.0 |
| GPT-5.4 OpenAI | $2.50 | $10.00 | 1M | 91 | 9.1 |
| GPT-5.3 Codex OpenAI | $2.50 | $10.00 | 400K | 92 | 9.2 |
| GPT-4o OpenAI | $2.50 | $10.00 | 128K | 60 | 6.0 |
| Claude Sonnet 4.6 Anthropic | $3.00 | $15.00 | 200K | 86 | 5.7 |
| Claude Sonnet 4.5 Anthropic | $3.00 | $15.00 | 200K | 83 | 5.5 |
| Grok 4.1 xAI | $3.00 | $15.00 | 1M | 89 | 5.9 |
| o3 OpenAI | $10.00 | $40.00 | 200K | 76 | 1.9 |
| o1 OpenAI | $15.00 | $60.00 | 200K | — | — |
| Claude Opus 4.6 Anthropic | $15.00 | $75.00 | 1M | 90 | 1.2 |
* Score/$ = Overall benchmark score divided by output price per million tokens. Higher is better. Free/open-weight models are excluded from this ratio.
Use our cost calculator to estimate monthly spending based on your usage patterns.
Open Cost CalculatorBased on our benchmarks, models like DeepSeek V3 and Gemini 3.0 Flash offer the best price-to-performance ratio. DeepSeek V3 scores competitively on benchmarks while costing a fraction of proprietary frontier models. For free self-hosted options, Meta's Llama 4 models provide excellent performance at zero API cost.
GPT-5.4 costs $2.50 per million input tokens and $10.00 per million output tokens. Claude Opus 4.6 costs $15.00 per million input tokens and $75.00 per million output tokens. GPT-5.4 is significantly cheaper per token, though both models score within a few points of each other on most benchmarks.
Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Self-hosting costs vary from $0.50-$5.00/hour for cloud GPUs depending on model size. Many providers also offer API access to open models at lower prices than proprietary alternatives.
Gemini 3.0 Flash and Gemini 2.5 Flash are among the cheapest at $0.15/$0.60 per million tokens (input/output). DeepSeek V3 is also very affordable at $0.27/$1.10. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.
Get notified when new models drop, benchmark scores change, or the leaderboard shifts. One email per week.
Free. No spam. Unsubscribe anytime.