Gemini API Pricing: Current Flash, Flash-Lite, and Pro Rates (April 2026)

Current Gemini API pricing from Google's official docs: 3.1 Pro Preview, 3.1 Flash-Lite Preview, 3 Flash Preview, 2.5 Flash, 2.5 Pro, plus Batch and Flex pricing.

Glevd · Published April 13, 2026 · 9 min read

Google's Gemini pricing is more nuanced than a single provider table can capture. The current official docs split pricing by model, service tier (Standard, Batch, Flex, Priority), and in some cases prompt size. That is why older one-line summaries often get Gemini wrong.

This guide uses the current official Gemini pricing page and Gemini rate-limit page.

The key pricing rows to know

Gemini 3.1 Pro Preview

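Tier | Input ($ per 1M tokens) | Output ($ per 1M tokens) | Notes
Standard | $2.00 (prompt <= 200K), $4.00 (prompt > 200K) | $12.00 (prompt <= 200K), $18.00 (prompt > 200K) | No free tier listed
Batch | $1.00 (prompt <= 200K), $2.00 (prompt > 200K) | $6.00 (prompt <= 200K), $9.00 (prompt > 200K) | Half-price async tier
Flex | $1.00 (prompt <= 200K), $2.00 (prompt > 200K) | $6.00 (prompt <= 200K), $9.00 (prompt > 200K) | Lower-priority tier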

Gemini 3.1 Flash-Lite Preview

Tier | Input ($ per 1M tokens) | Output ($ per 1M tokens)
Standard | $0.25 | $1.50
Batch | $0.125 | $0.75
Flex | $0.125 | $0.75

Google explicitly describes Flash-Lite as its most cost-efficient model and notes that preview models may change before becoming stable.

Gemini 3 Flash Preview

Tier | Input ($ per 1M tokens) | Output ($ per 1M tokens)
Standard | $0.50 | $3.00
Batch | $0.25 | $1.50
Flex | $0.25 | $1.50

Gemini 2.5 Flash

Tier | Input ($ per 1M tokens) | Output ($ per 1M tokens)
Standard | $0.30 | $2.50
Batch | $0.15 | $1.25
Flex | $0.15 | $1.25

Gemini 2.5 Pro

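Tier | Input ($ per 1M tokens) | Output ($ per 1M tokens) | Notes
Standard | $1.25 (prompt <= 200K), $2.50 (prompt > 200K) | $10.00 (prompt <= 200K), $15.00 (prompt > 200K) | Free tier listed
Batch | $0.625 (prompt <= 200K), $1.25 (prompt > 200K) | $5.00 (prompt <= 200K), $7.50 (prompt > 200K) | Half-price async tier
Flex | $0.625 (prompt <= 200K), $1.25 (prompt > 200K) | $5.00 (prompt <= 200K), $7.50 (prompt > 200K) | Lower-priority tier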

What changed versus the stale summaries still circulating

Three things matter:

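  1. Gemini 3.1 Pro Preview is not a $1.25/$5 model. The current official page lists Standard rates of $2.00 input and $12.00 output per million tokens for prompts up to 200K tokens, and $4.00/$18.00 above that.
  2. Gemini 2.5 Pro pricing changes above 200K prompt tokens: Standard rates rise from $1.25/$10.00 to $2.50/$15.00 per million tokens. If you ignore that threshold, you underestimate cost.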
  3. Google does not publish one universal free-tier RPM number. The rate-limit docs say limits vary by model and usage tier, preview models are more restricted, and your live limits should be checked in AI Studio.
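
To make the 200K threshold in point 2 concrete, here is a minimal Python sketch that picks the Gemini 2.5 Pro Standard bracket from the prompt size. The names `PRO_25_STANDARD` and `request_cost` are hypothetical, and the sketch assumes the whole request is billed at the bracket selected by its prompt size; confirm that billing rule against the official pricing page before relying on it.

```python
# Gemini 2.5 Pro Standard rates from the table above, in $ per 1M tokens.
# Each bracket is (input_rate, output_rate).
PRO_25_STANDARD = {
    "threshold": 200_000,
    "low": (1.25, 10.00),   # prompt <= 200K tokens
    "high": (2.50, 15.00),  # prompt > 200K tokens
}


def request_cost(prompt_tokens, output_tokens, pricing=PRO_25_STANDARD):
    """Estimate the cost in dollars of one request.

    Assumption: the bracket chosen by prompt size applies to every
    input and output token of the request.
    """
    if prompt_tokens > pricing["threshold"]:
        input_rate, output_rate = pricing["high"]
    else:
        input_rate, output_rate = pricing["low"]
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # A 300K-token prompt with 2K output tokens falls in the upper bracket.
    print(f"${request_cost(300_000, 2_000):.3f}")
```

Under that assumption, pricing the same 300K-token request at the lower bracket would give roughly half the estimate, which is the underestimate point 2 warns about.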

Real cost example: one workload across cheap, mid, and premium Gemini tiers

Assume 5,000 requests per day, each with 2,500 input tokens and 400 output tokens, and a 30-day month.

Gemini 3.1 Flash-Lite Preview Standard

  • Daily input: 5,000 x 2,500 x $0.25 / 1M = $3.125
  • Daily output: 5,000 x 400 x $1.50 / 1M = $3.00
  • Monthly total: about $183.75

Gemini 2.5 Flash Standard

  • Daily input: 5,000 x 2,500 x $0.30 / 1M = $3.75
  • Daily output: 5,000 x 400 x $2.50 / 1M = $5.00
  • Monthly total: about $262.50

Gemini 2.5 Pro Standard

This prompt is well under 200K tokens, so the lower Pro bracket applies:

  • Daily input: 5,000 x 2,500 x $1.25 / 1M = $15.625
  • Daily output: 5,000 x 400 x $10.00 / 1M = $20.00
  • Monthly total: about $1,068.75

Gemini 3.1 Pro Preview Standard

  • Daily input: 5,000 x 2,500 x $2.00 / 1M = $25.00
  • Daily output: 5,000 x 400 x $12.00 / 1M = $24.00
  • Monthly total: about $1,470

That is the real current shape of Gemini pricing: cheap Flash tiers, a middle 2.5 Pro tier, and a more expensive 3.1 Pro Preview tier.
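
The arithmetic above can be reproduced with a short Python sketch. The `RATES` dictionary and the `monthly_cost` helper are illustrative names, not part of any Google SDK; the rates are the Standard rows quoted in this guide, and the 30-day month is an assumption.

```python
# Standard-tier rates quoted in this guide, in $ per 1M tokens: (input, output).
# The Pro rows use the <= 200K prompt bracket, since the prompt is 2,500 tokens.
RATES = {
    "Gemini 3.1 Flash-Lite Preview": (0.25, 1.50),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 3.1 Pro Preview": (2.00, 12.00),
}


def monthly_cost(input_rate, output_rate, requests_per_day=5_000,
                 input_tokens=2_500, output_tokens=400, days=30):
    """Return the estimated monthly cost in dollars for a fixed daily workload."""
    daily_input = requests_per_day * input_tokens * input_rate / 1_000_000
    daily_output = requests_per_day * output_tokens * output_rate / 1_000_000
    return (daily_input + daily_output) * days


if __name__ == "__main__":
    for model, (input_rate, output_rate) in RATES.items():
        print(f"{model}: ${monthly_cost(input_rate, output_rate):,.2f}")
```

Changing `requests_per_day`, the token counts, or `days` lets you rerun the comparison for your own workload.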

When to use which Gemini tier

  • Use Gemini 3.1 Flash-Lite Preview when minimizing cost matters most.
  • Use Gemini 2.5 Flash when you want a cheap general-purpose workhorse that is not a preview model.
  • Use Gemini 2.5 Pro when you need Google's stronger reasoning and coding tier without jumping to 3.1 Pro Preview pricing.
  • Use Gemini 3.1 Pro Preview when you specifically want the newest Pro-preview model and are willing to pay for it.

Rate-limit advice that is actually correct

Google's current rate-limit docs say:

  • limits are measured across RPM, TPM, and RPD
  • limits vary by model and usage tier
  • preview models are more restricted
  • your active limits should be checked in AI Studio

So do not hard-code old blog-post numbers like "Flash is 15 RPM and Pro is 2 RPM" into planning docs. That is not how Google documents Gemini limits today.
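
One way to avoid hard-coded numbers is to pass the limits in from configuration. The sketch below is a hypothetical client-side guard, not a Google API: `QuotaGuard` and its fields are invented names, the caller supplies the RPM, TPM, and RPD values read from AI Studio, and rolling 60-second and 24-hour windows are a simplifying assumption that may not match how Google resets quotas.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QuotaGuard:
    """Client-side check against caller-supplied RPM, TPM, and RPD limits."""
    rpm: int  # requests per minute, as shown for your project in AI Studio
    tpm: int  # tokens per minute
    rpd: int  # requests per day
    _minute: deque = field(default_factory=deque)  # (timestamp, tokens) pairs
    _day: deque = field(default_factory=deque)     # request timestamps

    def allow(self, tokens: int, now: float) -> bool:
        """Record the request and return True if it fits within all three limits."""
        # Drop entries that have aged out of the rolling windows.
        while self._minute and now - self._minute[0][0] >= 60:
            self._minute.popleft()
        while self._day and now - self._day[0] >= 86_400:
            self._day.popleft()

        tokens_in_window = sum(t for _, t in self._minute)
        if len(self._minute) >= self.rpm:
            return False
        if tokens_in_window + tokens > self.tpm:
            return False
        if len(self._day) >= self.rpd:
            return False

        self._minute.append((now, tokens))
        self._day.append(now)
        return True


if __name__ == "__main__":
    # Placeholder limits for illustration only; these are NOT real Gemini quotas.
    guard = QuotaGuard(rpm=2, tpm=1_000, rpd=3)
    print(guard.allow(tokens=400, now=0.0))
    print(guard.allow(tokens=400, now=1.0))
    print(guard.allow(tokens=100, now=2.0))  # third request in the same minute
```

Passing `now` explicitly keeps the guard easy to test; in production you would typically supply `time.monotonic()`.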

The practical takeaway

For Gemini, cost control is mostly about model selection and service-tier selection:

  • Pick the right model.
  • Use Batch or Flex when latency is not critical.
  • Watch the 200K prompt threshold on Pro models.
  • Verify current quotas in AI Studio, not in screenshots or stale blog posts.
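
To see what moving latency-tolerant traffic to Batch could save, here is a rough Python sketch using the Gemini 2.5 Flash Standard and Batch rates from this guide. The `blended_monthly_cost` function and its `batch_share` parameter are hypothetical names, and the workload (5,000 requests per day, 2,500 input and 400 output tokens each, 30-day month) is the same assumed one as in the cost example.

```python
# Gemini 2.5 Flash rates from this guide, in $ per 1M tokens: (input, output).
STANDARD = (0.30, 2.50)
BATCH = (0.15, 1.25)


def monthly_cost(rates, requests_per_day, input_tokens=2_500,
                 output_tokens=400, days=30):
    """Return the monthly cost in dollars for one service tier."""
    input_rate, output_rate = rates
    per_request = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return per_request * requests_per_day * days


def blended_monthly_cost(batch_share, requests_per_day=5_000):
    """Return the monthly cost when `batch_share` of requests go through Batch."""
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    batch_requests = requests_per_day * batch_share
    standard_requests = requests_per_day - batch_requests
    return (monthly_cost(STANDARD, standard_requests)
            + monthly_cost(BATCH, batch_requests))


if __name__ == "__main__":
    for share in (0.0, 0.6, 1.0):
        print(f"{share:.0%} on Batch: ${blended_monthly_cost(share):,.2f}")
```

Because this guide lists Flex at the same rates as Batch, the same estimate applies to Flex traffic.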

For a multi-provider comparison, see our LLM pricing overview.