LLM Pricing Comparison 2026

Compare API pricing for every major LLM. Prices are per million tokens. According to BenchLM.ai, the best price-to-performance ratios come from models like DeepSeek V3 and Gemini Flash.

Last updated: March 7, 2026. Prices reflect official API rates. Self-hosted open-weight model costs depend on your infrastructure.

LLM API pricing varies by over 100x between the cheapest and most expensive models. At the low end, models like Gemini 3.0 Flash and DeepSeek V3 cost fractions of a cent per request. At the high end, frontier reasoning models like Claude Opus 4.6 and o3 can cost dollars per complex query.
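To make the per-request gap concrete, here is a minimal sketch that converts per-million-token prices into a cost per request. The rates are taken from the table on this page; the function name and the token counts (1,000 input, 500 output) are assumptions chosen for illustration, not measured workloads.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one request in USD, given prices in USD per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000

# Hypothetical request size: 1,000 input tokens and 500 output tokens.
flash = request_cost_usd(1_000, 500, 0.15, 0.60)    # Gemini 3.0 Flash rates
opus = request_cost_usd(1_000, 500, 15.00, 75.00)   # Claude Opus 4.6 rates

print(f"Gemini 3.0 Flash: ${flash:.5f}")
print(f"Claude Opus 4.6:  ${opus:.5f}")
```

Under these assumed token counts, the Flash request comes to roughly $0.00045 and the Opus request to roughly $0.0525. Reasoning models that emit tens of thousands of output tokens per query push the high end into dollars.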

The Score/$ column shows the overall benchmark score divided by the output price per million tokens; higher means better value. It helps identify models that deliver strong benchmark results without breaking the budget. Open-weight models listed as free, such as Llama 4 and Qwen3.5, are excluded from this ratio because their listed API price is zero (self-hosting still carries separate infrastructure costs). Models without a listed score also have no Score/$ value.
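The ratio can be reproduced with a few lines of code. This is a sketch of the calculation as described on this page, not the site's actual implementation; the function name is hypothetical, and rounding to one decimal place is an assumption that happens to match the published values.

```python
def score_per_dollar(score, output_price_per_m):
    """Overall benchmark score divided by output price (USD per million tokens).

    Returns None when there is no API price, since free models are excluded.
    """
    if not output_price_per_m:
        return None
    return round(score / output_price_per_m, 1)

print(score_per_dollar(49, 0.60))    # Gemini 2.5 Flash
print(score_per_dollar(90, 75.00))   # Claude Opus 4.6
print(score_per_dollar(42, None))    # Llama 4 Maverick (free, excluded)
```

Because only the output price enters the formula, two models with the same score and output price get the same Score/$ even if their input prices differ.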

For most teams, the choice comes down to whether you need absolute frontier performance or can accept 90% of the quality at 10% of the cost. Use our cost calculator to model your specific usage, or take the LLM selector quiz to find the best model for your use case and budget.
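One way to frame that trade-off is to set a minimum acceptable score and pick the cheapest model that clears it. The sketch below does this for a handful of rows copied from the table on this page; the function name, the choice of rows, and the use of output price as the only cost measure are assumptions for illustration.

```python
# (model, output price in USD per million tokens, overall score)
MODELS = [
    ("Gemini 2.5 Flash", 0.60, 49),
    ("Claude Haiku 4.5", 4.00, 64),
    ("Gemini 3.1 Pro", 5.00, 89),
    ("GPT-5.2", 8.00, 91),
    ("GPT-5.3 Codex", 10.00, 92),
    ("Claude Opus 4.6", 75.00, 90),
]

def cheapest_meeting(min_score, models=MODELS):
    """Return the lowest-output-price model scoring at least min_score, or None."""
    candidates = [m for m in models if m[2] >= min_score]
    return min(candidates, key=lambda m: m[1]) if candidates else None

print(cheapest_meeting(85))   # cheapest model scoring 85 or higher
print(cheapest_meeting(92))   # only the top scorer qualifies
```

With these rows, a bar of 85 selects Gemini 3.1 Pro at $5.00 output, while insisting on the top score of 92 doubles the output price to $10.00. A real selection would also weigh input price, context window, and task-specific benchmarks.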

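| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | Context | Score | Score/$ |
|---|---|---|---|---|---|---|
| Llama 4 Maverick | Meta | Free* | Free* | 1M | 42 | - |
| Llama 4 Scout | Meta | Free* | Free* | 10M | 42 | - |
| Qwen3.5 235B | Alibaba | Free* | Free* | 128K | - | - |
| Qwen3.5 397B | Alibaba | Free* | Free* | 128K | 71 | - |
| Grok 3 Mini | xAI | $0.30 | $0.50 | 128K | - | - |
| Gemini 3.0 Flash | Google | $0.15 | $0.60 | 1M | - | - |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 | 1M | 49 | 81.7 |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 128K | - | - |
| DeepSeek Coder 2.0 | DeepSeek | $0.27 | $1.10 | 128K | 73 | 66.4 |
| Mistral Medium 3 | Mistral | $0.40 | $2.00 | 128K | - | - |
| DeepSeek R1 | DeepSeek | $0.55 | $2.19 | 128K | - | - |
| Claude Haiku 4.5 | Anthropic | $0.80 | $4.00 | 200K | 64 | 16.0 |
| o3-mini | OpenAI | $1.10 | $4.40 | 200K | - | - |
| o4-mini | OpenAI | $1.10 | $4.40 | 200K | - | - |
| Gemini 3.1 Pro | Google | $1.25 | $5.00 | 1M | 89 | 17.8 |
| Gemini 2.5 Pro | Google | $1.25 | $5.00 | 1M | 72 | 14.4 |
| GPT-5.1 | OpenAI | $1.50 | $6.00 | 200K | 85 | 14.2 |
| Mistral Large 3 | Mistral | $2.00 | $6.00 | 128K | 68 | 11.3 |
| GPT-5.2 | OpenAI | $2.00 | $8.00 | 400K | 91 | 11.4 |
| GPT-5.1-Codex-Max | OpenAI | $2.00 | $8.00 | 400K | 87 | 10.9 |
| GPT-5.2-Codex | OpenAI | $2.00 | $8.00 | 400K | 88 | 11.0 |
| GPT-5.4 | OpenAI | $2.50 | $10.00 | 1M | 91 | 9.1 |
| GPT-5.3 Codex | OpenAI | $2.50 | $10.00 | 400K | 92 | 9.2 |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | 60 | 6.0 |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 200K | 86 | 5.7 |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 200K | 83 | 5.5 |
| Grok 4.1 | xAI | $3.00 | $15.00 | 1M | 89 | 5.9 |
| o3 | OpenAI | $10.00 | $40.00 | 200K | 76 | 1.9 |
| o1 | OpenAI | $15.00 | $60.00 | 200K | - | - |
| Claude Opus 4.6 | Anthropic | $15.00 | $75.00 | 1M | 90 | 1.2 |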

* Score/$ = Overall benchmark score divided by output price per million tokens. Higher is better. Free/open-weight models are excluded from this ratio.


Frequently Asked Questions

Which LLM has the best price-to-performance ratio?

Based on our benchmarks, models like DeepSeek V3 and Gemini 3.0 Flash offer the best price-to-performance ratio. DeepSeek V3 scores competitively on benchmarks while costing a fraction of proprietary frontier models. Among models with a Score/$ value in the table above, Gemini 2.5 Flash (81.7) and DeepSeek Coder 2.0 (66.4) rank highest. For free self-hosted options, Meta's Llama 4 models are available at zero API cost.

How much does it cost to use GPT-5.4 vs Claude Opus 4.6?

GPT-5.4 costs $2.50 per million input tokens and $10.00 per million output tokens. Claude Opus 4.6 costs $15.00 per million input tokens and $75.00 per million output tokens. GPT-5.4 is therefore 6x cheaper on input and 7.5x cheaper on output, while the two models' overall scores in the table above differ by only one point (91 vs. 90).

Are open-source LLMs really free?

Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Cloud GPU costs for self-hosting range from $0.50 to $5.00 per hour, depending on model size. Many providers also offer API access to open models at lower prices than proprietary alternatives.
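To compare self-hosting with API pricing, an hourly GPU rate can be converted into a cost per million generated tokens. The sketch below assumes the GPU is kept fully busy; the function name and the throughput figure of 500 tokens per second are hypothetical and not benchmarks of any particular model or hardware.

```python
def self_hosted_cost_per_million(gpu_hourly_usd, tokens_per_second):
    """USD per million generated tokens on a GPU running at full utilization."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000

# Hypothetical setup: a $2.00/hour cloud GPU sustaining 500 tokens/second.
cost = self_hosted_cost_per_million(2.00, 500)
print(f"${cost:.2f} per million tokens")
```

Under these assumptions the result is about $1.11 per million tokens. Idle time raises the effective cost proportionally: at 25% utilization, the same setup costs roughly four times as much per token.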

What is the cheapest LLM API in 2026?

Gemini 3.0 Flash and Gemini 2.5 Flash are among the cheapest at $0.15/$0.60 per million tokens (input/output). DeepSeek V3 is also very affordable at $0.27/$1.10. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.
