Top standard AI models (no chain-of-thought reasoning) ranked by benchmark performance. Faster and cheaper than reasoning models.
Unless noted otherwise, ranking surfaces on this page use BenchLM's provisional leaderboard lane rather than the stricter sourced-only verified leaderboard.
Bottom line: Non-reasoning models are faster and cheaper than chain-of-thought alternatives. Claude Opus 4.7 leads this tier, showing that strong benchmark scores are possible without dedicated thinking tokens.
According to BenchLM.ai, Claude Opus 4.7 leads this ranking with a score of 97, followed by Gemini 3.1 Pro (93) and Claude Opus 4.6 (91). There is meaningful separation between the top models, suggesting genuine performance differences.
The best open-weight option is GLM-5 (ranked #8 with a score of 77). While proprietary models lead, open-weight options are within striking distance for teams willing to trade a few points of performance for full model control.
This ranking is based on provisional overall weighted scores computed with BenchLM.ai's scoring formula. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.
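The page does not publish the scoring formula itself, but a weighted overall score of the kind described can be sketched as below. The category names and weights here are hypothetical placeholders, not BenchLM.ai's actual formula:

```python
# Hypothetical sketch of a weighted overall score.
# CATEGORY_WEIGHTS is an assumption for illustration; the real
# BenchLM.ai categories and weights are not given in this excerpt.
CATEGORY_WEIGHTS = {
    "reasoning": 0.20,
    "knowledge": 0.15,
    "coding": 0.15,
    "instruction_following": 0.15,
    "multilingual": 0.10,
    "multimodal": 0.10,
    "long_context": 0.10,
    "safety": 0.05,
}

def overall_score(category_scores: dict[str, float]) -> float:
    """Weighted average of per-category scores (each on a 0-100 scale).

    Missing categories are ignored and the remaining weights are
    renormalized, so a model scored on a subset of categories still
    gets a comparable 0-100 overall number.
    """
    total = sum(
        CATEGORY_WEIGHTS[cat] * score
        for cat, score in category_scores.items()
        if cat in CATEGORY_WEIGHTS
    )
    used_weight = sum(
        w for cat, w in CATEGORY_WEIGHTS.items() if cat in category_scores
    )
    return round(total / used_weight, 1) if used_weight else 0.0
```

With this shape, a model scoring 90 in every category gets an overall score of exactly 90.0 regardless of the weights, which is a useful sanity check for any weighted-average leaderboard formula.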
Claude Opus 4.7
Anthropic · 1M
Gemini 3.1 Pro
Google · 1M
Leads the reasoning, knowledge, and multilingual categories in this tier.
Claude Opus 4.6
Anthropic · 1M
Most balanced non-reasoning model. Top instruction following (95).
Gemini 3.1 Pro leads non-reasoning models in reasoning (97), knowledge (96), and multilingual (100).
Claude Opus 4.6 is the most consistent non-reasoning model across all 8 categories.
Claude Sonnet 4.6 is a strong mid-tier option, with the best multimodal score (95) in this tier.
Best non-reasoning model?
Claude Opus 4.7, with the highest overall score (97) in this tier
Production reliability?
Claude Opus 4.6 — most consistent in this tier
Lower latency and cost?
Non-reasoning models skip chain-of-thought — all are faster than reasoning alternatives
Compare with reasoning models?
See reasoning models to evaluate the accuracy-speed trade-off
Get notified when models move. One email a week with what changed and why.
Free. No spam. Unsubscribe anytime.
The top model is Claude Opus 4.7 by Anthropic with a provisional score of 97.
The best open-weight model is GLM-5 at position #8.
67 models are included in this ranking.
Non-reasoning models are standard completion/chat models without dedicated chain-of-thought. They are ranked by the same overall BenchLM score and are typically faster and cheaper per token.
The "non-reasoning" label excludes models with explicit chain-of-thought (like o3, DeepSeek R1). Some non-reasoning models still reason internally — the distinction is about architecture and pricing, not capability.
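The per-token cost advantage can be made concrete with a quick sketch. The prices and token counts below are illustrative assumptions only, not any vendor's actual rates; the key point is that thinking tokens are commonly billed at the output-token rate, so a reasoning model pays for tokens the user never sees:

```python
def request_cost(input_toks: int, output_toks: int, thinking_toks: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Dollar cost of one request, given per-million-token prices.

    Thinking tokens are treated as billed output tokens, which is the
    common pricing model for chain-of-thought APIs.
    """
    billed_output = output_toks + thinking_toks
    return (input_toks * in_price_per_m
            + billed_output * out_price_per_m) / 1_000_000

# Illustrative numbers: same prompt and answer length, but the
# reasoning model also emits 4,000 hidden thinking tokens.
non_reasoning = request_cost(2_000, 500, 0, 3.0, 15.0)
reasoning = request_cost(2_000, 500, 4_000, 3.0, 15.0)
```

Under these assumed prices the reasoning request costs several times more than the non-reasoning one for an identical visible answer, which is the trade-off this tier avoids.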