Current OpenAI API pricing from official docs: GPT-5.4, GPT-5.2, GPT-5.1, cached input rates, Batch API discounts, and the pricing details that actually matter.
OpenAI's public pricing changed enough in 2026 that broad comparison tables age badly. This guide sticks to what OpenAI currently publishes in official docs: the live API pricing page for the GPT-5.4 family, and the GPT-5.2 launch page for GPT-5.2, GPT-5.1, and GPT-5 Pro pricing.
Use our cost calculator for quick estimates and our token counter to sanity-check prompt size before you ship.
| Model | Input $/M | Cached Input $/M | Output $/M |
|---|---|---|---|
| GPT-5.4 | $2.50 | $0.25 | $15.00 |
| GPT-5.4 mini | $0.75 | $0.075 | $4.50 |
| GPT-5.4 nano | $0.20 | $0.02 | $1.25 |
OpenAI notes that those are the standard processing rates for context lengths under 270K tokens.
| Model | Input $/M | Cached Input $/M | Output $/M |
|---|---|---|---|
| GPT-5.2 | $1.75 | $0.175 | $14.00 |
| GPT-5.2 Pro | $21.00 | - | $168.00 |
| GPT-5.1 | $1.25 | $0.125 | $10.00 |
| GPT-5 Pro | $15.00 | - | $120.00 |
OpenAI also says it has no current plans to deprecate GPT-5.1, GPT-5, or GPT-4.1 in the API, which is why those older rows still matter for real production systems.
For most teams, the practical order is straightforward: start with the cheapest tier that passes your evals, and only move up when the evals say you have to. Most teams overspend by defaulting to the flagship tier before they have an eval set.
Assume a coding assistant handling 1,000 requests per day, with 2,000 input tokens and 500 output tokens per request. At the rates above, that works out to roughly $375/month on GPT-5.4, $112.50/month on GPT-5.4 mini, and about $30.75/month on GPT-5.4 nano.
That is why OpenAI's cheapest tier is worth testing first: the gap from GPT-5.4 to GPT-5.4 nano on this workload is more than 12x.
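The arithmetic behind that workload can be sketched in a few lines. This is an estimate only; the rates are copied from the GPT-5.4 table above, and a 30-day month is assumed:

```python
# Rough monthly cost for the workload above: 1,000 requests/day,
# 2,000 input + 500 output tokens per request, 30-day month.
RATES = {  # $ per 1M tokens, from the GPT-5.4 table above
    "gpt-5.4":      {"input": 2.50, "output": 15.00},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def monthly_cost(model, requests_per_day=1_000,
                 input_tokens=2_000, output_tokens=500, days=30):
    r = RATES[model]
    daily = (requests_per_day * input_tokens / 1e6) * r["input"] + \
            (requests_per_day * output_tokens / 1e6) * r["output"]
    return daily * days

for model in RATES:
    print(f"{model}: ${monthly_cost(model):,.2f}/month")
# gpt-5.4: $375.00/month
# gpt-5.4-mini: $112.50/month
# gpt-5.4-nano: $30.75/month
```

Swap in your own request counts and token sizes; the shape of the math does not change across tiers.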
For the GPT-5.4 family, cached input is priced at exactly 10% of the standard input rate.
If your app reuses the same system prompt, policy block, or large shared prefix, that discount matters immediately.
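To put a number on that, here is a sketch using GPT-5.4 rates and the same 1,000-requests/day workload, with a hypothetical split: 1,500 of each request's 2,000 input tokens are a shared prefix (system prompt plus policy block) billed at the cached rate:

```python
# Input-side spend with vs. without prompt caching, GPT-5.4 rates.
# Hypothetical split: 1,500 of 2,000 input tokens per request are a
# cached shared prefix; the remaining 500 are fresh per-request input.
INPUT_RATE = 2.50    # $/M tokens, standard input
CACHED_RATE = 0.25   # $/M tokens, cached input (10% of standard)

def daily_input_cost(cached_tokens, fresh_tokens, requests=1_000):
    return requests * (cached_tokens * CACHED_RATE
                       + fresh_tokens * INPUT_RATE) / 1e6

no_cache = daily_input_cost(0, 2_000)      # $5.00/day
with_cache = daily_input_cost(1_500, 500)  # $1.625/day, 67.5% less
```

The bigger and more stable your shared prefix, the closer your effective input rate drifts toward the cached price.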
OpenAI's live pricing page says the Batch API cuts both input and output prices by 50%. That means the $375/month GPT-5.4 example above drops to roughly $187.50/month if the workload can tolerate asynchronous processing.
If you are generating large volumes of SEO pages, support summaries, or offline enrichment jobs, Batch is usually the first optimization to enable.
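Because the discount applies to both input and output prices, modeling it is just halving any synchronous estimate, as in this sketch:

```python
# Batch API: 50% off both input and output, so a synchronous
# monthly estimate can simply be halved for batch-tolerant work.
BATCH_DISCOUNT = 0.50

def batch_monthly(standard_monthly):
    return standard_monthly * (1 - BATCH_DISCOUNT)

print(batch_monthly(375.00))  # 187.5 -- the GPT-5.4 example above
```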
OpenAI's pricing page also says data residency and Regional Processing endpoints add 10% for all models released after March 5, 2026. If you have compliance-driven routing, do not ignore that surcharge when you model spend.
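The surcharge stacks on top of whatever else you have modeled; a minimal sketch, applied to the batched GPT-5.4 figure above:

```python
# Data residency / Regional Processing: +10% on per-token pricing
# for models released after March 5, 2026. Applied last, on top of
# any caching or batch adjustments already in the estimate.
RESIDENCY_SURCHARGE = 0.10

def with_residency(cost):
    return cost + cost * RESIDENCY_SURCHARGE

print(with_residency(187.50))  # batched GPT-5.4 example -> $206.25
```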
If you just want the shortest path to a sane OpenAI bill: start on GPT-5.4 nano or mini and only upgrade when evals demand it, structure prompts so shared prefixes hit the cached-input rate, and move anything asynchronous to the Batch API.
For a provider-level comparison across multiple vendors, see our LLM pricing overview.