LLM API Pricing Comparison 2026
Compare API pricing for every major LLM side by side. All prices are per million tokens. The Score/$ column uses BenchLM's provisional overall score per dollar of output; higher means better value.
Last updated: May 13, 2026. Prices reflect official API rates. Open-weight model costs depend on your infrastructure.
Ministral 3 3B: $0.10 in / $0.10 out per 1M tokens
DeepSeek V4 Flash (Max): Score/$ ratio 264.3
Claude Mythos Preview: Prov. score 99, $25 in / $125 out
| Model | Input ($/1M) | Output ($/1M) | Context | Prov. score | Score/$ | Self-host est. |
|---|---|---|---|---|---|---|
| Claude Mythos Preview Anthropic Claude Mythos family · preview | $25.00 | $125.00 | 1M | 99 | 0.8 | — |
| GPT-5.3 Codex OpenAI | $1.75 | $14.00 | 400K | ~87 | 6.6 | — |
| GPT-5.1-Codex-Max OpenAI | $1.25 | $10.00 | 400K | ~76 | 9.1 | — |
| GPT-5.2-Codex OpenAI | $1.75 | $14.00 | 400K | ~77 | 6.4 | — |
| GPT-5.4 Pro OpenAI GPT-5.4 family · pro | $30.00 | $180.00 | 1.05M | 91 | 0.5 | — |
| Grok 4.1 xAI | N/A | N/A | 1M | ~90 | — | — |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | 92 | 7.4 | — |
| GPT-5.4 OpenAI GPT-5.4 family · base | $2.50 | $15.00 | 1.05M | 89 | 5.9 | — |
| GPT-5.5 OpenAI GPT-5.5 family · base | $5.00 | $30.00 | 1M | 91 | 3.0 | — |
| Gemini 3 Pro Deep Think Gemini 3 Pro family · reasoning | N/A | N/A | 2M | ~91 | — | — |
| Claude Opus 4.6 Anthropic | $5.00 | $25.00 | 1M | 87 | 3.5 | — |
| Claude Opus 4.7 (Adaptive) Anthropic Claude Opus 4.7 family · reasoning | $5.00 | $25.00 | 1M | 90 | 3.4 | — |
| DeepSeek V4 Pro (Max) DeepSeek DeepSeek V4 family · pro-reasoning | $1.74 | $3.48 | 1M | 88 | 24.4 | — |
| GPT-5 (medium) OpenAI GPT-5 family · reasoning | N/A | N/A | 128K | ~71 | — | — |
| GPT-5 (high) OpenAI GPT-5 family · reasoning | $1.25 | $10.00 | 400K | ~78 | 8.4 | — |
| GPT-5.1 OpenAI | $1.25 | $10.00 | 400K | ~79 | 8.3 | — |
| o1-preview OpenAI o1 family · snapshot | $15.00 | $60.00 | 200K | ~83 | 1.4 | — |
| Kimi 2.6 Moonshot AI Kimi 2.6 family · base | $0.95 | $4.00 | 256K | 85 | 20.5 | $18,221/mo |
| Claude Sonnet 4.6 Anthropic | $3.00 | $15.00 | 200K | 83 | 5.4 | — |
| GLM-5 (Reasoning) Z.AI GLM-5 family · reasoning | $1.00 | $3.20 | 200K | ~82 | 25.3 | — |
| Grok 4.1 Fast xAI | $0.20 | $0.50 | 2M | ~70 | 162.0 | — |
| DeepSeek V4 Pro (High) DeepSeek DeepSeek V4 family · pro-reasoning | $1.74 | $3.48 | 1M | 84 | 23.0 | — |
| GPT-5.2 OpenAI GPT-5.2 family · thinking | $1.75 | $14.00 | 400K | 81 | 5.6 | — |
| Qwen3.5 397B (Reasoning) Alibaba Qwen3.5 397B family · reasoning | $0.60 | $3.60 | 128K | ~79 | 21.7 | — |
| Claude Sonnet 4.5 Anthropic | $3.00 | $15.00 | 200K | ~66 | 5.1 | — |
| Gemini 3 Pro Gemini 3 Pro family · base | $2.00 | $12.00 | 2M | 81 | 6.3 | — |
| GLM-5.1 Z.AI GLM-5 family · flagship | $1.40 | $4.40 | 203K | 83 | 17.1 | $18,221/mo |
| DeepSeek V4 Flash (Max) DeepSeek DeepSeek V4 family · flash-reasoning | $0.14 | $0.28 | 1M | 76 | 264.3 | — |
| Claude Opus 4.5 Anthropic | $5.00 | $25.00 | 200K | 77 | 2.9 | — |
| Kimi K2.5 (Reasoning) Moonshot AI Kimi K2.5 family · reasoning | $0.60 | $3.00 | 256K | 76 | 24.0 | — |
| o3-pro OpenAI o3 family · pro | $20.00 | $80.00 | 200K | ~58 | 0.9 | — |
| Qwen3.6-27B Alibaba Qwen3.6-27B family · base | Free* | Free* | 262K | 74 | — | $429/mo |
| o3-mini OpenAI o3 family · mini | $1.10 | $4.40 | 200K | ~56 | 15.9 | — |
| o3 OpenAI o3 family · base | $2.00 | $8.00 | 200K | ~58 | 8.6 | — |
| GLM-4.7 Z.AI | Free* | Free* | 200K | ~69 | — | — |
| GLM-5 Z.AI GLM-5 family · base | $1.00 | $3.20 | 200K | 67 | 20.9 | — |
| MiMo-V2-Flash Xiaomi MiMo-V2-Flash family · base | Free* | Free* | 256K | ~60 | — | — |
| Qwen3.5-122B-A10B Alibaba | Free* | Free* | 262K | 65 | — | — |
| Qwen3.6 Plus Alibaba | N/A | N/A | 1M | 73 | — | — |
| DeepSeek V4 Flash (High) DeepSeek DeepSeek V4 family · flash-reasoning | $0.14 | $0.28 | 1M | 71 | 235.7 | — |
| o1 OpenAI o1 family · base | $15.00 | $60.00 | 200K | ~57 | 1.1 | — |
| GPT-4.1 OpenAI GPT-4.1 family · base | $2.00 | $8.00 | 1M | ~58 | 8.1 | — |
| Grok 4 xAI | N/A | N/A | 128K | ~65 | — | — |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M | ~65 | 6.4 | — |
| Kimi K2.5 Moonshot AI | $0.60 | $3.00 | 256K | 64 | 21.3 | $5,221/mo |
| Qwen2.5-1M Alibaba | Free* | Free* | 1M | ~51 | — | — |
| Qwen3.5 397B Alibaba | $0.60 | $3.60 | 128K | 64 | 17.8 | — |
| Qwen3.5-27B Alibaba | Free* | Free* | 262K | 63 | — | — |
| DeepSeek Coder 2.0 DeepSeek | N/A | N/A | 128K | ~52 | — | — |
| DeepSeek V4 Pro DeepSeek DeepSeek V4 family · pro | $1.74 | $3.48 | 1M | 70 | 17.8 | — |
| DeepSeekMath V2 DeepSeek DeepSeekMath family · snapshot | Free* | Free* | 128K | ~50 | — | — |
| Grok 4.20 xAI Grok 4.20 family · reasoning | $2.00 | $6.00 | 2M | 65 | 10.2 | — |
| Claude 4 Sonnet Anthropic | $3.00 | $15.00 | 200K | ~51 | 4.0 | — |
| Claude 4.1 Opus Anthropic Claude 4.1 Opus family · base | $15.00 | $75.00 | 200K | ~52 | 0.8 | — |
| DeepSeek V3.2 (Thinking) DeepSeek DeepSeek V3.2 family · reasoning | $0.55 | $2.19 | 128K | ~62 | 27.4 | — |
| Nemotron 3 Ultra 500B NVIDIA | Free* | Free* | 10M | ~47 | — | — |
| Qwen3.5-35B-A3B Alibaba Qwen3.5-35B-A3B family · base | Free* | Free* | 262K | 56 | — | — |
| Qwen2.5-72B Alibaba | Free* | Free* | 128K | ~50 | — | — |
| Kimi K2 Moonshot AI | $0.60 | $2.50 | 128K | ~42 | 22.8 | — |
| Claude Haiku 4.5 Anthropic | $1.00 | $5.00 | 200K | ~58 | 11.2 | — |
| DeepSeek V3.2 DeepSeek DeepSeek V3.2 family · base | $0.28 | $0.42 | 128K | ~58 | 133.3 | — |
| Gemini 3 Flash | $0.50 | $3.00 | 1M | ~65 | 18.7 | — |
| DeepSeek LLM 2.0 DeepSeek | Free* | Free* | 128K | ~51 | — | — |
| MiniMax M2.7 MiniMax | $0.30 | $1.20 | 200K | 62 | 45.0 | — |
| Claude 3.5 Sonnet Anthropic | $3.00 | $15.00 | 200K | ~41 | 3.5 | — |
| GPT-4.1 mini OpenAI GPT-4.1 family · mini | $0.40 | $1.60 | 1M | ~45 | 32.5 | — |
| Nemotron 3 Super 100B NVIDIA | Free* | Free* | 1M | ~44 | — | — |
| o4-mini (high) OpenAI o4-mini family · reasoning | N/A | N/A | 200K | ~44 | — | — |
| DeepSeek V4 Flash DeepSeek DeepSeek V4 family · flash | $0.14 | $0.28 | 1M | 59 | 182.1 | — |
| GPT-4o mini OpenAI GPT-4o family · mini | $0.15 | $0.60 | 128K | ~50 | 83.3 | — |
| Llama 3.1 405B Meta | Free* | Free* | 128K | ~41 | — | — |
| Mistral Large 3 Mistral | $0.50 | $1.50 | 256K | ~49 | 33.3 | $9,110/mo |
| Claude 4.1 Opus Thinking Anthropic Claude 4.1 Opus family · reasoning | N/A | N/A | 200K | ~44 | — | — |
| Grok Code Fast 1 xAI | $0.20 | $1.50 | 256K | ~40 | 32.7 | — |
| Mistral Large 2 Mistral | N/A | N/A | 128K | ~38 | — | — |
| Sarvam 105B Sarvam Sarvam 105B family · base | Free* | Free* | 128K | ~39 | — | — |
| GPT-4o OpenAI GPT-4o family · base | $2.50 | $10.00 | 128K | ~43 | 4.3 | — |
| DeepSeek V3 DeepSeek DeepSeek family · snapshot | $0.27 | $1.10 | 128K | ~36 | 36.4 | $18,221/mo |
| Gemini 1.5 Pro | $1.25 | $5.00 | 1M | ~36 | 7.8 | — |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | 1M | ~48 | 26.0 | — |
| Phi-4 Microsoft | Free* | Free* | 16K | ~28 | — | — |
| Qwen3 235B 2507 (Reasoning) Alibaba Qwen3 235B 2507 family · reasoning | Free* | Free* | 128K | ~47 | — | — |
| Claude 3 Opus Anthropic | $15.00 | $75.00 | 200K | ~35 | 0.5 | — |
| DBRX Instruct Databricks DBRX family · instruct | Free* | Free* | 32K | ~33 | — | — |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1M | ~38 | 13.2 | — |
| GPT-4.1 nano OpenAI GPT-4.1 family · nano | $0.10 | $0.40 | 1M | ~27 | 82.5 | — |
| Gemini 1.0 Pro | N/A | N/A | 32K | ~25 | — | — |
| Claude 3 Haiku Anthropic | $0.25 | $1.25 | 200K | ~24 | 24.8 | — |
| GPT-4 Turbo OpenAI | $10.00 | $30.00 | 128K | ~25 | 1.0 | — |
| GPT-OSS 120B OpenAI GPT-OSS family · base | Free* | Free* | 128K | ~35 | — | — |
| Mistral 8x7B Mistral Mistral 8x7B family · base | Free* | Free* | 32K | ~24 | — | — |
| Nemotron Ultra 253B NVIDIA | Free* | Free* | 32K | ~22 | — | — |
| Moonshot v1 Moonshot AI Moonshot family · snapshot | N/A | N/A | 128K | ~23 | — | — |
| Z-1 Z | N/A | N/A | 128K | ~24 | — | — |
| o1-pro OpenAI o1 family · pro | $150.00 | $600.00 | 200K | ~29 | 0.1 | — |
| DeepSeek R1 DeepSeek | $0.55 | $2.19 | 128K | ~33 | 11.9 | $18,221/mo |
| Mixtral 8x22B Instruct v0.1 Mistral Mixtral 8x22B family · instruct | Free* | Free* | 64K | ~23 | — | — |
| Nemotron-4 15B NVIDIA | Free* | Free* | 32K | ~23 | — | — |
| Llama 3 70B Meta | Free* | Free* | 128K | ~27 | — | — |
| Qwen3 235B 2507 Alibaba Qwen3 235B 2507 family · base | Free* | Free* | 128K | ~33 | — | — |
| Nemotron 3 Nano 30B NVIDIA | Free* | Free* | 32K | ~26 | — | — |
| Llama 4 Scout Meta | Free* | Free* | 10M | ~22 | — | $2,278/mo |
| Grok 3 [Beta] xAI Grok 3 family · snapshot | N/A | N/A | 128K | ~32 | — | — |
| Llama 4 Maverick Meta | Free* | Free* | 1M | ~17 | — | $2,610/mo |
| DeepSeek V3.1 (Reasoning) DeepSeek DeepSeek V3.1 family · reasoning | Free* | Free* | 128K | ~30 | — | — |
| Llama 4 Behemoth Meta | Free* | Free* | 32K | ~12 | — | — |
| Nova Pro Amazon | N/A | N/A | 128K | ~10 | — | — |
| Gemma 3 27B | Free* | Free* | 32K | ~17 | — | — |
| GLM-4.5 Z.AI | $0.60 | $2.20 | 128K | ~27 | 3.6 | — |
| GPT-OSS 20B OpenAI GPT-OSS family · mini | Free* | Free* | 128K | ~17 | — | — |
| DeepSeek V3.1 DeepSeek DeepSeek V3.1 family · base | Free* | Free* | 128K | ~26 | — | — |
| GLM-4.5-Air Z.AI | $0.20 | $1.10 | 128K | ~19 | 4.5 | — |
| Mistral 7B v0.3 Mistral Mistral 7B family · snapshot | Free* | Free* | 32K | ~5 | — | — |
| Mistral 8x7B v0.2 Mistral Mistral 8x7B family · snapshot | Free* | Free* | 32K | ~2 | — | — |
| 1-bit Bonsai 1.7B Prism ML 1-bit Bonsai family · 1-7b | Free* | Free* | 32K | — | — | — |
| 1-bit Bonsai 4B Prism ML 1-bit Bonsai family · 4b | Free* | Free* | 32K | — | — | — |
| 1-bit Bonsai 8B Prism ML 1-bit Bonsai family · 8b | Free* | Free* | 64K | — | — | — |
| Aion-2.0 Aion Labs | $0.80 | $1.60 | 128K | — | — | — |
| Claude Opus 4.7 Anthropic Claude Opus 4.7 family · base | $5.00 | $25.00 | 1M | — | — | — |
| Composer 2 Cursor Composer family · base | $0.50 | $2.50 | 200K | — | — | — |
| DeepSeek R1 Distill Qwen 32B DeepSeek DeepSeek R1 Distill family · qwen-32b | Free* | Free* | 128K | — | — | — |
| DeepSeek V4 Flash Base DeepSeek DeepSeek V4 family · base | N/A | N/A | 1M | — | — | — |
| DeepSeek V4 Pro Base DeepSeek DeepSeek V4 family · base | N/A | N/A | 1M | — | — | — |
| Gemma 4 26B A4B Gemma 4 family · 26b-a4b | Free* | Free* | 256K | — | — | — |
| Gemma 4 31B Gemma 4 family · 31b | Free* | Free* | 256K | — | — | $429/mo |
| Gemma 4 E2B Gemma 4 family · e2b | Free* | Free* | 128K | — | — | — |
| Gemma 4 E4B Gemma 4 family · e4b | Free* | Free* | 128K | — | — | — |
| GLM-4.7-Flash Z.AI | Free* | Free* | 200K | — | — | — |
| GLM-5-Turbo Z.AI GLM-5 family · turbo | $1.20 | $4.00 | 200K | — | — | — |
| GLM-5V-Turbo Z.AI GLM-5 family · vision-turbo | $1.20 | $4.00 | 200K | — | — | — |
| GPT-5 mini OpenAI GPT-5 family · mini | $0.25 | $2.00 | 128K | — | — | — |
| GPT-5 nano OpenAI GPT-5 family · nano | $0.05 | $0.40 | 400K | — | — | — |
| GPT-5.2 Instant OpenAI GPT-5.2 family · instant | $1.50 | $6.00 | 128K | — | — | — |
| GPT-5.2 Pro OpenAI GPT-5.2 family · pro | $25.00 | $150.00 | 400K | — | — | — |
| GPT-5.3 Instant OpenAI GPT-5.3 family · instant | $1.75 | $14.00 | 128K | — | — | — |
| GPT-5.3-Codex-Spark OpenAI GPT-5.3 Codex family · spark | N/A | N/A | 256K | — | — | — |
| GPT-5.4 mini OpenAI GPT-5.4 family · mini | $0.75 | $4.50 | 400K | — | — | — |
| GPT-5.4 nano OpenAI GPT-5.4 family · nano | $0.20 | $1.25 | 400K | — | — | — |
| GPT-5.5 Pro OpenAI GPT-5.5 family · pro | $30.00 | $180.00 | 1M | — | — | — |
| Granite-4.0-1B IBM Granite 4.0 1B family · dense | Free* | Free* | 128K | — | — | — |
| Granite-4.0-350M IBM Granite 4.0 350M family · dense | Free* | Free* | 32K | — | — | — |
| Granite-4.0-H-1B IBM Granite 4.0 1B family · hybrid | Free* | Free* | 128K | — | — | — |
| Granite-4.0-H-350M IBM Granite 4.0 350M family · hybrid | Free* | Free* | 32K | — | — | — |
| Grok 3 Mini xAI | $0.30 | $0.50 | 128K | — | — | — |
| Grok 4.20 Multi-agent xAI Grok 4.20 family · multi-agent | N/A | N/A | 2M | — | — | — |
| Grok 4.3 xAI | $1.25 | $2.50 | 1M | — | — | — |
| Holo3-122B-A10B H Company Holo3 family · 122b-a10b | N/A | N/A | 64K | — | — | — |
| Holo3-35B-A3B H Company Holo3 family · 35b-a3b | N/A | N/A | 64K | — | — | — |
| Hy3 Preview Tencent Hy3 family · preview | Free* | Free* | 256K | — | — | — |
| Interfaze Beta Interfaze Interfaze family · beta | $1.50 | $3.50 | 1M | — | — | — |
| Laguna M.1 Poolside Laguna family · m-1 | Free* | Free* | 131K | — | — | — |
| Laguna XS.2 Poolside Laguna family · xs-2 | Free* | Free* | 131K | — | — | — |
| Leanstral Mistral | Free* | Free* | 256K | — | — | — |
| LFM2-24B-A2B LiquidAI | Free* | Free* | 32K | — | — | — |
| LFM2.5-1.2B-Instruct LiquidAI LFM2.5 1.2B family · instruct | Free* | Free* | 32K | — | — | — |
| LFM2.5-1.2B-Thinking LiquidAI LFM2.5 1.2B family · reasoning | Free* | Free* | 32K | — | — | — |
| LFM2.5-350M LiquidAI | Free* | Free* | 32K | — | — | — |
| LFM2.5-VL-450M LiquidAI | Free* | Free* | 128K | — | — | — |
| Ling 2.6 Flash InclusionAI Ling 2.6 family · flash | N/A | N/A | 262K | — | — | — |
| Mercury 2 Inception | $0.25 | $0.75 | 128K | — | — | — |
| MiMo-V2.5 Xiaomi MiMo-V2.5 family · base | N/A | N/A | 1M | — | — | — |
| MiMo-V2.5-Pro Xiaomi MiMo-V2.5 family · pro | N/A | N/A | 1M | — | — | — |
| MiniMax M1 80k MiniMax | N/A | N/A | 80K | — | — | — |
| MiniMax M2.5 MiniMax | $0.30 | $1.20 | 128K | — | — | — |
| Ministral 3 14B Mistral Ministral 3 14B family · base | $0.20 | $0.20 | 256K | — | — | — |
| Ministral 3 14B (Reasoning) Mistral Ministral 3 14B family · reasoning | $0.20 | $0.20 | 256K | — | — | — |
| Ministral 3 3B Mistral Ministral 3 3B family · base | $0.10 | $0.10 | 256K | — | — | — |
| Ministral 3 3B (Reasoning) Mistral Ministral 3 3B family · reasoning | $0.10 | $0.10 | 256K | — | — | — |
| Ministral 3 8B Mistral Ministral 3 8B family · base | $0.15 | $0.15 | 256K | — | — | — |
| Ministral 3 8B (Reasoning) Mistral Ministral 3 8B family · reasoning | $0.15 | $0.15 | 256K | — | — | — |
| Mistral Medium 3 Mistral | $0.40 | $2.00 | 128K | — | — | — |
| Mistral Medium 3.5 128B Mistral Mistral Medium 3.5 family · 128b | $1.50 | $7.50 | 256K | — | — | — |
| Mistral Small 4 Mistral Mistral Small 4 family · base | $0.15 | $0.60 | 256K | — | — | $2,278/mo |
| Mistral Small 4 (Reasoning) Mistral Mistral Small 4 family · reasoning | $0.15 | $0.60 | 256K | — | — | — |
| Nemotron 3 Nano Omni 30B A3B NVIDIA Nemotron 3 Nano Omni family · 30b-a3b | Free* | Free* | 256K | — | — | — |
| Nemotron 3 Super 120B A12B NVIDIA | Free* | Free* | 256K | — | — | — |
| o4-mini OpenAI | $1.10 | $4.40 | 200K | — | — | — |
| Qwen2.5 Coder 32B Instruct Alibaba Qwen2.5 Coder family · 32b-instruct | Free* | Free* | 128K | — | — | — |
| Qwen2.5-VL-32B Alibaba | Free* | Free* | 32K | — | — | — |
| Qwen3.5 Flash Alibaba | $0.10 | $0.40 | 1M | — | — | — |
| Qwen3.5 Plus Alibaba | $0.40 | $2.40 | 1M | — | — | — |
| Sarvam 30B Sarvam Sarvam 30B family · base | Free* | Free* | 64K | — | — | — |
| Seed 1.6 ByteDance Seed 1.6 family · base | N/A | N/A | 256K | — | — | — |
| Seed 1.6 Flash ByteDance Seed 1.6 family · flash | N/A | N/A | 256K | — | — | — |
| Seed-2.0-Lite ByteDance Seed 2.0 family · lite | N/A | N/A | 256K | — | — | — |
| Seed-2.0-Mini ByteDance Seed 2.0 family · mini | N/A | N/A | 256K | — | — | — |
| Step 3.5 Flash StepFun | $0.10 | $0.30 | 256K | — | — | — |
| Ternary Bonsai 1.7B Prism ML Ternary Bonsai family · 1-7b | Free* | Free* | 32K | — | — | — |
| Ternary Bonsai 4B Prism ML Ternary Bonsai family · 4b | Free* | Free* | 32K | — | — | — |
| Ternary Bonsai 8B Prism ML Ternary Bonsai family · 8b | Free* | Free* | 64K | — | — | — |
| Trinity-Large-Preview Arcee AI Trinity Large family · preview | $0.25 | $1.00 | 512K | — | — | — |
| Trinity-Large-Thinking Arcee AI Trinity Large family · thinking | $0.25 | $0.90 | 512K | — | — | — |
| ZAYA1-74B-Preview Zyphra ZAYA1 family · 74b-preview | Free* | Free* | 256K | — | — | — |
| ZAYA1-8B Zyphra ZAYA1 family · 8b | Free* | Free* | 131K | — | — | — |
* Score/$ = Overall benchmark score ÷ output price per million tokens. Higher is better. Free/open-weight models are excluded.
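The footnote's formula can be sketched in a few lines of Python. This is a minimal illustration, not BenchLM's actual implementation: the function name `score_per_dollar` is hypothetical, and the sample rows are copied from the table above. Some published ratios may not match this calculation exactly, since the displayed scores are provisional and rounded.

```python
from typing import Optional


def score_per_dollar(score: Optional[float],
                     output_price: Optional[float]) -> Optional[float]:
    """Score/$ as the footnote defines it: score / output price per 1M tokens.

    Returns None when the score is missing or the model has no paid
    output price (Free* or N/A rows), mirroring the table's exclusions.
    """
    if score is None or not output_price:
        return None
    return score / output_price


# (model, provisional score, output price in $ per 1M tokens), from the table.
rows = [
    ("Claude Mythos Preview", 99, 125.00),
    ("GPT-5.4", 89, 15.00),
    ("GPT-5.5", 91, 30.00),
    ("Qwen3.6-27B", 74, None),  # Free*: excluded from Score/$
]

for name, score, price in rows:
    ratio = score_per_dollar(score, price)
    label = "excluded" if ratio is None else f"{ratio:.1f}"
    print(f"{name}: {label}")
```

For these rows the rounded results agree with the Score/$ column (0.8, 5.9, and 3.0), and the free model is reported as excluded.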
Frequently Asked Questions
Which LLM has the best price-to-performance ratio?
By the Score/$ column above, DeepSeek V4 Flash (Max) leads at 264.3, followed by DeepSeek V4 Flash (High) at 235.7, DeepSeek V4 Flash at 182.1, Grok 4.1 Fast at 162.0, and DeepSeek V3.2 at 133.3. DeepSeek V3 (36.4), Gemini 3.1 Flash-Lite (26.0), and Gemini 3 Flash (18.7) also cost a fraction of proprietary frontier models, and Google's Flash tiers are positioned for lower-cost, high-volume usage. Open-weight models such as Meta's Llama 4 carry no API fee when self-hosted, but they incur infrastructure costs and are excluded from the Score/$ column.
How much more expensive is GPT-5.4 Pro than GPT-5.4?
GPT-5.4 is $2.50 input and $15.00 output per million tokens. GPT-5.4 Pro is $30.00 input and $180.00 output. That makes Pro 12x the cost on both input and output tokens, for a provisional score of 91 versus 89, so it only makes sense when that accuracy gain justifies the extra spend.
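To see what the 12x gap means per request, here is a rough Python sketch that prices a single call under per-million-token rates. The function name `request_cost` and the 10,000-input / 2,000-output token workload are assumptions chosen for illustration; the prices are the ones quoted above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars of one request, given prices in $ per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
base = request_cost(10_000, 2_000, 2.50, 15.00)    # GPT-5.4
pro = request_cost(10_000, 2_000, 30.00, 180.00)   # GPT-5.4 Pro

print(f"GPT-5.4:     ${base:.3f} per request")
print(f"GPT-5.4 Pro: ${pro:.3f} per request")
print(f"Pro costs {pro / base:.1f}x as much")
```

Because both the input and output rates are 12x higher, the ratio stays at 12x regardless of the input/output mix.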
Are open-source LLMs really free?
Open-weight models like Llama 4, DeepSeek, and Qwen are free to download, but running them requires GPU infrastructure. Self-hosting costs vary from $0.50 to $5.00 per hour for cloud GPUs depending on model size, and the Self-host est. column above ranges from $429/mo to $18,221/mo. Many providers also offer API access to open models at lower prices than proprietary alternatives.
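One way to weigh self-hosting against an API is a break-even volume: the monthly token count at which API spend equals a fixed hosting bill. The sketch below is a simplification under stated assumptions: the function names and the 25% output share are hypothetical, it uses the Kimi 2.6 row from the table ($0.95 in, $4.00 out, $18,221/mo self-host estimate), and it assumes the self-hosted deployment can actually serve that volume, which the table does not state.

```python
def blended_price(input_price: float, output_price: float,
                  output_fraction: float) -> float:
    """Average $ per 1M tokens for a given share of output tokens (0 to 1)."""
    if not 0.0 <= output_fraction <= 1.0:
        raise ValueError("output_fraction must be between 0 and 1")
    return input_price * (1 - output_fraction) + output_price * output_fraction


def breakeven_million_tokens(monthly_cost: float, input_price: float,
                             output_price: float, output_fraction: float) -> float:
    """Millions of tokens per month at which API spend equals monthly_cost."""
    price = blended_price(input_price, output_price, output_fraction)
    if price <= 0:
        raise ValueError("blended API price must be positive")
    return monthly_cost / price


# Kimi 2.6 row, with an assumed 25% share of output tokens.
volume = breakeven_million_tokens(18_221, 0.95, 4.00, 0.25)
print(f"Break-even: about {volume:,.0f}M tokens per month")
```

Below that volume the API is cheaper under these assumptions; above it, the fixed self-hosting bill starts to pay off. A workload with a higher output share lowers the break-even point, since output tokens are priced higher.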
What is the cheapest LLM API in 2026?
Among priced APIs in the table above, GPT-5 nano has the lowest input price at $0.05/$0.40 per million tokens (input/output), and Ministral 3 3B has the lowest output price at $0.10/$0.10. Gemini 3.1 Flash-Lite is listed at $0.25/$1.50, Gemini 2.5 Flash at $0.30/$2.50, Gemini 3 Flash at $0.50/$3.00, and DeepSeek V3 at $0.27/$1.10. For reasoning models, o3-mini and o4-mini offer budget-friendly options at $1.10/$4.40.