Current Gemini API pricing from Google's official docs: 3.1 Pro Preview, 3.1 Flash-Lite Preview, 3 Flash Preview, 2.5 Flash, 2.5 Pro, plus Batch and Flex pricing.
Google's Gemini pricing is more nuanced than a single provider table can capture. The current official docs split pricing by model, service tier (Standard, Batch, Flex, Priority), and in some cases prompt size. That is why older one-line summaries often get Gemini wrong.
This guide uses the current official Gemini pricing page and Gemini rate-limit page.
Gemini 3.1 Pro Preview

| Tier | Input $/M | Output $/M | Notes |
|---|---|---|---|
| Standard | $2.00 <= 200K prompt, $4.00 > 200K | $12.00 <= 200K, $18.00 > 200K | No free tier listed |
| Batch | $1.00 <= 200K, $2.00 > 200K | $6.00 <= 200K, $9.00 > 200K | Half-price async tier |
| Flex | $1.00 <= 200K, $2.00 > 200K | $6.00 <= 200K, $9.00 > 200K | Lower-priority tier |

Gemini 3.1 Flash-Lite Preview

| Tier | Input $/M | Output $/M |
|---|---|---|
| Standard | $0.25 | $1.50 |
| Batch | $0.125 | $0.75 |
| Flex | $0.125 | $0.75 |
Google explicitly describes Flash-Lite as its most cost-efficient model and notes that preview models may change before becoming stable.

Gemini 3 Flash Preview

| Tier | Input $/M | Output $/M |
|---|---|---|
| Standard | $0.50 | $3.00 |
| Batch | $0.25 | $1.50 |
| Flex | $0.25 | $1.50 |

Gemini 2.5 Flash

| Tier | Input $/M | Output $/M |
|---|---|---|
| Standard | $0.30 | $2.50 |
| Batch | $0.15 | $1.25 |
| Flex | $0.15 | $1.25 |

Gemini 2.5 Pro

| Tier | Input $/M | Output $/M | Notes |
|---|---|---|---|
| Standard | $1.25 <= 200K prompt, $2.50 > 200K | $10.00 <= 200K, $15.00 > 200K | Free tier listed |
| Batch | $0.625 <= 200K, $1.25 > 200K | $5.00 <= 200K, $7.50 > 200K | Half-price async tier |
| Flex | $0.625 <= 200K, $1.25 > 200K | $5.00 <= 200K, $7.50 > 200K | Lower-priority tier |
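
The prompt-size split in the Pro tables is easy to get wrong in a spreadsheet, so here is a minimal Python sketch of how a per-request cost could be computed from the rates in the table directly above (the $1.25 / $10.00 table). It assumes that the bracket is chosen by the size of the prompt and that the chosen bracket then applies to all input and output tokens of that request; that reading of the table is an assumption, and the function and dictionary names are hypothetical.

```python
PROMPT_THRESHOLD = 200_000  # tokens; bracket boundary taken from the table

# USD per 1M tokens, copied from the table above.
# Each entry: (input <=200K, input >200K, output <=200K, output >200K)
RATES = {
    "standard": (1.25, 2.50, 10.00, 15.00),
    "batch": (0.625, 1.25, 5.00, 7.50),
    "flex": (0.625, 1.25, 5.00, 7.50),
}


def request_cost(tier: str, prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the prompt size alone selects the bracket, and the
    selected rates apply to every token in the request.
    """
    in_low, in_high, out_low, out_high = RATES[tier]
    long_prompt = prompt_tokens > PROMPT_THRESHOLD
    in_rate = in_high if long_prompt else in_low
    out_rate = out_high if long_prompt else out_low
    return (prompt_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for tier in RATES:
        short = request_cost(tier, 100_000, 1_000)
        long_ = request_cost(tier, 300_000, 1_000)
        print(f"{tier:<9} 100K prompt: ${short:.4f}   300K prompt: ${long_:.4f}")
```

Swapping in the rates from another table only requires changing the `RATES` dictionary; models without a prompt-size split can repeat the same value in both brackets.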
Three things matter:
Assume 5,000 requests per day, each with 2,500 input tokens and 400 output tokens.
This prompt is well under 200K tokens, so the lower Pro bracket applies:
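
The arithmetic for this workload can be sketched in a few lines of Python. The Standard-tier rates are copied from the tables above; the mapping of each table to a model name follows the order in which the models are listed at the top of this guide, and the 30-day month is an assumption, as are all identifier names.

```python
REQUESTS_PER_DAY = 5_000
INPUT_TOKENS = 2_500    # per request
OUTPUT_TOKENS = 400     # per request
DAYS_PER_MONTH = 30     # assumption: a flat 30-day month

# Standard-tier USD per 1M tokens as (input, output), using the
# <=200K bracket where a table splits by prompt size.
# Assumption: tables map to models in the order the guide lists them.
STANDARD_RATES = {
    "3.1 Pro Preview": (2.00, 12.00),
    "3.1 Flash-Lite Preview": (0.25, 1.50),
    "3 Flash Preview": (0.50, 3.00),
    "2.5 Flash": (0.30, 2.50),
    "2.5 Pro": (1.25, 10.00),
}


def monthly_cost(input_rate: float, output_rate: float) -> float:
    """Monthly USD cost for the example workload at the given rates."""
    daily_input_m = REQUESTS_PER_DAY * INPUT_TOKENS / 1_000_000    # 12.5M tokens
    daily_output_m = REQUESTS_PER_DAY * OUTPUT_TOKENS / 1_000_000  # 2M tokens
    daily = daily_input_m * input_rate + daily_output_m * output_rate
    return daily * DAYS_PER_MONTH


if __name__ == "__main__":
    for model, (in_rate, out_rate) in STANDARD_RATES.items():
        print(f"{model:<24} ${monthly_cost(in_rate, out_rate):>9,.2f}/month")
```

Under these assumptions the workload comes to roughly $184 per month on 3.1 Flash-Lite Preview, $263 on 2.5 Flash, $368 on 3 Flash Preview, $1,069 on 2.5 Pro, and $1,470 on 3.1 Pro Preview. Because Batch and Flex are listed at half the Standard rates, halving each figure gives the corresponding Batch or Flex estimate.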
That is the real current shape of Gemini pricing: cheap Flash tiers, a middle 2.5 Pro tier, and a more expensive 3.1 Pro Preview tier.
Google's current rate-limit docs say:
So do not hard-code old blog-post numbers like "Flash is 15 RPM and Pro is 2 RPM" into planning docs. That is not how Google documents Gemini limits today.
For Gemini, cost control is mostly about model selection and service-tier selection:
For a multi-provider comparison, see our LLM pricing overview.