All Google Gemini and Gemma models ranked by benchmark performance.
Unless noted otherwise, rankings on this page use BenchLM's provisional leaderboard rather than the stricter, sourced-only verified leaderboard.
Bottom line: Gemini 3.1 Pro is Google's best model and the top non-reasoning model on BenchLM. Gemini 3 Pro Deep Think adds reasoning capability, and the Flash variants offer strong value.
According to BenchLM.ai, Gemini 3.1 Pro leads this ranking with a score of 93, followed by Gemini 3 Pro Deep Think (86) and Gemini 3 Pro (83). There is a significant gap between the leading models and the rest of the field.
The best open-weight option is Gemma 4 31B (ranked #6 with a score of 66). Proprietary models lead by a wide margin: Gemma 4 31B trails Gemini 3.1 Pro by 27 points, so it suits teams willing to trade performance for full model control.
This ranking is based on provisional overall scores, weighted according to BenchLM.ai's scoring formula. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.
Gemini 3.1 Pro
Google · 1M
Best Google model. Top non-reasoning overall. Strong reasoning (97) and knowledge (96).
Gemini 3 Pro Deep Think
Google · 2M
Reasoning variant with a perfect multimodal score. Best for complex visual reasoning.
Gemini 3 Pro
Google · 2M
Solid all-rounder. Good balance of performance and cost.
Gemini 3.1 Pro leads Google's lineup as the best non-reasoning model on the leaderboard.
Gemini 3 Pro Deep Think is the reasoning variant, with a perfect multimodal score and strong math (96).
Gemini 3 Pro is a solid all-rounder with good multimodal performance (86).
Best Google model overall?
Gemini 3.1 Pro — strongest across all categories
Complex reasoning tasks?
Gemini 3 Pro Deep Think — reasoning model with strong math
Budget-friendly Google?
Gemini 3.1 Flash-Lite — best value in Google's lineup
Multimodal workloads?
Gemini 3 Pro Deep Think — perfect multimodal score
The top model is Gemini 3.1 Pro by Google with a provisional score of 93.
The best open-weight model is Gemma 4 31B at position #6.
12 models are included in this ranking.
Models are ranked by the same overall BenchLM score used across all leaderboards. Comparing within Google's lineup helps identify which model fits your use case and budget.
This page only shows Google models. Cross-provider comparison requires the overall or category-specific leaderboards. Newer models may have limited benchmark coverage initially.