Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Both models are tied with an overall score of 61.
Category     Claude 4.1 Opus   Mistral Large 3
Knowledge    74.5              73.8
Coding       68                68
Math         75                75
Reasoning    73                72
Claude 4.1 Opus and Mistral Large 3 are tied overall, each with a score of 61.
Claude 4.1 Opus leads in knowledge tasks with an average score of 74.5 vs 73.8.
Claude 4.1 Opus and Mistral Large 3 are tied for coding with average scores of 68.
Claude 4.1 Opus and Mistral Large 3 are tied for math with average scores of 75.
Claude 4.1 Opus leads in reasoning with an average score of 73 vs 72.
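The per-category verdicts above follow from comparing the two averages in each category. A minimal Python sketch of that comparison, using only the category averages reported here, might look like the following. The `scores` dictionary and the `category_verdict` helper are illustrative names, not part of any published tooling. The overall score of 61 is left out because the source does not say how it is derived from the category averages.

```python
# Category averages as reported in the comparison above.
scores = {
    "knowledge": {"Claude 4.1 Opus": 74.5, "Mistral Large 3": 73.8},
    "coding": {"Claude 4.1 Opus": 68, "Mistral Large 3": 68},
    "math": {"Claude 4.1 Opus": 75, "Mistral Large 3": 75},
    "reasoning": {"Claude 4.1 Opus": 73, "Mistral Large 3": 72},
}


def category_verdict(results):
    """Return the leading model's name, or "tie" when the top scores are equal."""
    best = max(results.values())
    leaders = [model for model, score in results.items() if score == best]
    return leaders[0] if len(leaders) == 1 else "tie"


# Hypothetical helper output: one verdict per category.
verdicts = {category: category_verdict(r) for category, r in scores.items()}

for category, verdict in verdicts.items():
    print(f"{category}: {verdict}")
```

Exact equality is safe here because the tied values are identical literals. Scores computed from raw results would likely need a tolerance instead.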