Model comparison

Claude Fable 5 vs Claude Opus 4.8

Data verified July 24, 2026

Head-to-head evidence from 32 shared benchmark results across 6 categories. Overall scores shown here use the public BenchAlign v5 ranking lane.

Claude Fable 5 (Anthropic): 82.76/100, 2 category wins
Claude Opus 4.8 (Anthropic): 77.44/100, 1 category win
Margin: 5.3 pts in favor of Claude Fable 5

Public leaderboard positions: Claude Fable 5 #3 (Supported); Claude Opus 4.8 #6 (Supported). Intervals and evidence labels describe ranking uncertainty, not a guarantee for a specific workload.

Evidence parity. Claude Fable 5 and Claude Opus 4.8 share 32 comparable benchmark results. 3 of 8 categories are comparable. 2 results are unique to Claude Fable 5; 23 to Claude Opus 4.8.

Updated July 24, 2026

Shared results: 32
Claude Fable 5 only: 2
Claude Opus 4.8 only: 23
Comparable categories: 3 / 8
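The parity counts above come down to checking, for each benchmark, which of the two models has a sourced result. The sketch below illustrates that bookkeeping on a small sample of rows taken from the benchmark tables on this page; the helper name `evidence_parity` and the choice of sample rows are assumptions for illustration, so the sample does not reproduce the full 32 / 2 / 23 split.

```python
# A small sample of rows from the benchmark tables on this page.
# Each value is (Claude Fable 5, Claude Opus 4.8); None marks a missing result.
RESULTS = {
    "Terminal-Bench 2.0": (84.3, 74.6),
    "OfficeQA Pro": (57.9, 66.2),
    "VulcanBench v3": (87.0, None),
    "BrowseComp": (None, 84.3),
    "USAMO 2026": (None, 96.7),
}


def evidence_parity(results):
    """Count shared and single-model results (hypothetical helper)."""
    counts = {"shared": 0, "fable_only": 0, "opus_only": 0}
    for fable, opus in results.values():
        if fable is not None and opus is not None:
            counts["shared"] += 1
        elif fable is not None:
            counts["fable_only"] += 1
        elif opus is not None:
            counts["opus_only"] += 1
    return counts


print(evidence_parity(RESULTS))
# prints {'shared': 2, 'fable_only': 1, 'opus_only': 2}
```

Only the shared rows can contribute to a head-to-head margin, which is why the large number of Claude Opus 4.8-only results does not move the verdict.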

Pick Claude Fable 5 if you want the stronger benchmark profile. Claude Opus 4.8 only becomes the better choice if multimodal & grounded is the priority or you want the cheaper token bill.

Confidence note. This is a partial-evidence comparison with 32 shared benchmark results across 6 evidence categories; 3 of 8 categories currently have scoreable aggregates for both models. Treat the verdict as directional until coverage is more balanced.

Why this result

Claude Fable 5 is clearly ahead on the BenchAlign aggregate, 82.76 to 77.44, a gap of 5.32 points.

Claude Fable 5's sharpest advantage is in coding, where it averages 89.2 against 81.1. The largest gap among the decisive benchmark drivers is SWE-bench Pro, 80% to 69.2%. Claude Opus 4.8 does hit back in multimodal & grounded, so the answer changes if that is the part of the workload you care about most.

Claude Fable 5 is also the more expensive model on tokens at $10.00 input / $50.00 output per 1M tokens, versus $5.00 input / $25.00 output per 1M tokens for Claude Opus 4.8. That is roughly 2.0x on output cost alone. Claude Fable 5 gives you the larger context window at 1M+, compared with 1M for Claude Opus 4.8.
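To see what the price gap means for a concrete workload, the listed per-1M-token prices can be plugged into a short calculation. This is a minimal sketch: the `PRICES` table copies the listed prices above, while the helper name `request_cost` and the example token counts are assumptions chosen for illustration.

```python
# Listed prices in USD per 1M tokens, copied from the comparison above.
PRICES = {
    "Claude Fable 5": {"input": 10.00, "output": 50.00},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the listed-price cost in USD for a workload (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000_000, 500_000):.2f}")
# prints:
# Claude Fable 5: $45.00
# Claude Opus 4.8: $22.50
```

Because both the input and the output price are exactly double, Claude Fable 5 costs twice as much at listed prices for any mix of input and output tokens.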

Category breakdown

Exact category averages are shown below. Not measured means BenchLM does not have enough sourced public coverage for that model and category.

Category scores and score margins for Claude Fable 5 and Claude Opus 4.8
Category	Claude Fable 5	Margin	Claude Opus 4.8
Multimodal	57.9	19.1 (Claude Opus 4.8 leads)	77.0
Coding	89.2	8.1 (Claude Fable 5 leads)	81.1
Agentic	84.6	4.3 (Claude Fable 5 leads)	80.3
Reasoning	Not measured	No overlap	72.1
Knowledge	Not measured	No overlap	62.7
Math	Not measured	No overlap	53.9
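The margins in the category table are simple differences between the two category averages, with "No overlap" whenever one side is not measured. A minimal sketch of that arithmetic follows; the helper name `category_margin` and the output wording are assumptions, and the scores are copied from the table above.

```python
from typing import Optional

# Category averages as (Claude Fable 5, Claude Opus 4.8).
# None stands for "Not measured".
CATEGORY_SCORES = {
    "Multimodal": (57.9, 77.0),
    "Coding": (89.2, 81.1),
    "Agentic": (84.6, 80.3),
    "Reasoning": (None, 72.1),
    "Knowledge": (None, 62.7),
    "Math": (None, 53.9),
}


def category_margin(fable: Optional[float], opus: Optional[float]) -> str:
    """Describe the margin for one category (hypothetical helper)."""
    if fable is None or opus is None:
        return "No overlap"
    # Round to one decimal to avoid floating-point noise such as 8.100000000000009.
    diff = round(fable - opus, 1)
    if diff == 0:
        return "Tie"
    leader = "Claude Fable 5" if diff > 0 else "Claude Opus 4.8"
    return f"{leader} +{abs(diff)}"


for category, (fable, opus) in CATEGORY_SCORES.items():
    print(f"{category}: {category_margin(fable, opus)}")
```

Treating a missing score as "No overlap" rather than as zero keeps an unmeasured category from counting as a loss for either model.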

Decisive benchmark drivers

The largest measured benchmark gaps in this matchup, with exact reported values.

Benchmark	Category	Claude Fable 5	Claude Opus 4.8	Winner	Δ
SWE-bench Pro	Coding	80%	69.2%	Claude Fable 5	10.8
Terminal-Bench 2.0	Agentic	84.3%	74.6%	Claude Fable 5	9.7
OfficeQA Pro	Multimodal	57.9%	66.2%	Claude Opus 4.8	8.3
SWE-bench Verified	Coding	95%	88.6%	Claude Fable 5	6.4
OSWorld-Verified	Agentic	85%	83.4%	Claude Fable 5	1.6
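The Δ values for the drivers listed above are absolute differences in percentage points, and the ordering is by the size of that gap. The following sketch reproduces the ordering from the reported scores; the function name `rank_by_gap` is an assumption, and the approach only makes sense for benchmarks reported on the same percentage scale.

```python
# (benchmark, Claude Fable 5 %, Claude Opus 4.8 %), copied from the drivers above.
DRIVERS = [
    ("OSWorld-Verified", 85.0, 83.4),
    ("SWE-bench Verified", 95.0, 88.6),
    ("OfficeQA Pro", 57.9, 66.2),
    ("Terminal-Bench 2.0", 84.3, 74.6),
    ("SWE-bench Pro", 80.0, 69.2),
]


def rank_by_gap(rows):
    """Return (benchmark, winner, gap) tuples, largest gap first (hypothetical helper)."""
    ranked = []
    for name, fable, opus in rows:
        # Round to one decimal to avoid floating-point noise in the difference.
        gap = round(abs(fable - opus), 1)
        if fable == opus:
            winner = "Tie"
        else:
            winner = "Claude Fable 5" if fable > opus else "Claude Opus 4.8"
        ranked.append((name, winner, gap))
    return sorted(ranked, key=lambda row: row[2], reverse=True)


for name, winner, gap in rank_by_gap(DRIVERS):
    print(f"{name}: {winner} by {gap} points")
```

Benchmarks scored on other scales, such as the rating-style values reported for GDPval-AA or Design Arena Website, would need to be normalized before being ranked alongside percentage gaps.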

Operational comparison

Runtime and commercial metrics are compared only when both models have a complete sourced value.

Metric	Claude Fable 5	Claude Opus 4.8	Comparison
Input / output price (USD per 1M tokens)	$10 input / $50 output	$5 input / $25 output	Claude Opus 4.8 has the lower combined listed price.
Generation speed (tokens per second)	Not available	Not available	A complete speed comparison is not available.
First-answer latency (seconds to first token)	Not available	Not available	A complete latency comparison is not available.
Context window (maximum listed tokens)	1M+	1M	Claude Fable 5 lists the larger context window.

Benchmark Deep Dive

Agentic: Claude Fable 5 wins

21 benchmarks

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
Terminal-Bench 2.0	84.3%	74.6%	Claude Fable 5 leads
OSWorld-Verified	85%	83.4%	Claude Fable 5 leads
GDPval-AA	1747	1593	Claude Fable 5 leads
AA Agentic Index	52.8%	47.2%	Claude Fable 5 leads
τ²-bench results	98.5%	94.4%	Claude Fable 5 leads
GDPval-AA	62.3%	54.6%	Claude Fable 5 leads
AA Briefcase	1574	1346	Claude Fable 5 leads
AA AutomationBench	48.6%	48.5%	Claude Fable 5 leads
AA EnterpriseOps-Gym	51.1%	44.0%	Claude Fable 5 leads
AA Harvey LAB	93.6%	91.1%	Claude Fable 5 leads
AA Tau3 Banking	26.8%	27.6%	Claude Opus 4.8 leads
terminalBenchHard	62.9%	58.3%	Claude Fable 5 leads
aaTerminalBench21	84.6%	84.6%	Tie
BrowseComp	—	84.3%	Not comparable
DeepSearchQA	—	93.1%	Not comparable
Finance Agent v2	—	53.9%	Not comparable
MCP Atlas	—	82.2%	Not comparable
Toolathlon	—	59.9%	Not comparable
Gert Labs	—	72.97%	Not comparable
ResearchClawBench	—	21.1%	Not comparable
OSWorld 2.0	—	20.6%	Not comparable

Coding: Claude Fable 5 wins

11 benchmarks

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
SWE-bench Verified	95%	88.6%	Claude Fable 5 leads
SWE-bench Pro	80%	69.2%	Claude Fable 5 leads
FrontierCode 1.1 Main	53.5%	46.5%	Claude Fable 5 leads
Terminal-Bench 2.0	84.3%	74.6%	Claude Fable 5 leads
cursorBench31	70.6%	58.4%	Claude Fable 5 leads
cursorBench32	70.5%	62.3%	Claude Fable 5 leads
VulcanBench v3	87.0%	—	Not comparable
AA Coding Index	76.5%	74.3%	Claude Fable 5 leads
AA-SciCode	60.2%	53.5%	Claude Fable 5 leads
SWE Multilingual	—	84.4%	Not comparable
SWE Multimodal	—	38.4%	Not comparable

Reasoning

4 benchmarks

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
AA-LCR	70.0%	67.7%	Claude Fable 5 leads
CritPt	28.6%	20.9%	Claude Fable 5 leads
ARC-AGI-2	—	72.1%	Not comparable
ARC-AGI-3	—	1.5%	Not comparable

Knowledge

10 benchmarks

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
Artificial Analysis Intelligence Index	59.9%	55.7%	Claude Fable 5 leads
AA-GPQA Diamond	92.6%	92.0%	Claude Fable 5 leads
AA-HLE	53.3%	45.7%	Claude Fable 5 leads
AA-Omniscience Index	40.2%	27.4%	Claude Fable 5 leads
AA-Omniscience Accuracy	61.4%	46.6%	Claude Fable 5 leads
AA-Omniscience Hallucination Rate (lower is better)	54.9%	35.9%	Claude Opus 4.8 leads
GPQA	—	93.6%	Not comparable
GPQA-D	—	93.6%	Not comparable
HLE	—	57.9%	Not comparable
HLE w/o tools	—	49.8%	Not comparable

Math

3 benchmarks

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
USAMO 2026	—	96.7%	Not comparable
FrontierMath v2 (Tiers 1-3)	—	47.241%	Not comparable
FrontierMath v2 (Tier 4)	—	31.250%	Not comparable

Multilingual

1 benchmark

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
INCLUDE	—	87.6%	Not comparable

Multimodal: Claude Opus 4.8 wins

6 benchmarks

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
Blueprint-Bench 2	38.6%	—	Not comparable
OfficeQA Pro	57.9%	66.2%	Claude Opus 4.8 leads
Design Arena Website	1332	1270	Claude Fable 5 leads
ScreenSpot Pro	—	87.9%	Not comparable
CharXiv	—	89.9%	Not comparable
CharXiv w/o tools	—	80.5%	Not comparable

Inst. Following

1 benchmark

Benchmark	Claude Fable 5	Claude Opus 4.8	Result
AA-IFBench	63.5%	62.2%	Claude Fable 5 leads

Frequently Asked Questions

Which is better, Claude Fable 5 or Claude Opus 4.8?

Claude Fable 5 is ahead on BenchLM's BenchAlign leaderboard, 82.76 to 77.44. Among the decisive benchmark drivers, the biggest separator is SWE-bench Pro, where Claude Fable 5 scores 80% and Claude Opus 4.8 scores 69.2%.

Which is better for coding, Claude Fable 5 or Claude Opus 4.8?

Claude Fable 5 has the edge for coding in this comparison, averaging 89.2 versus 81.1. Inside this category, cursorBench31 is the benchmark that creates the most daylight between them.

Which is better for agentic tasks, Claude Fable 5 or Claude Opus 4.8?

Claude Fable 5 has the edge for agentic tasks in this comparison, averaging 84.6 versus 80.3. Inside this category, AA Briefcase is the benchmark that creates the most daylight between them.

Which is better for multimodal and grounded tasks, Claude Fable 5 or Claude Opus 4.8?

Claude Opus 4.8 has the edge for multimodal and grounded tasks in this comparison, averaging 77.0 versus 57.9. Of the two shared benchmarks in this category, Claude Opus 4.8 leads on OfficeQA Pro (66.2% versus 57.9%), while Claude Fable 5 leads on Design Arena Website (1332 versus 1270).
