DeepSeek LLM 2.0 vs Mistral Large 2

Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.

Quick Verdict

DeepSeek LLM 2.0 wins overall with a score of 63 vs 57 (a 6-point difference). It also leads in all 4 categories.

Knowledge

Benchmark           DeepSeek LLM 2.0    Mistral Large 2
Category average    76.8                66.5
MMLU                79                  68
GPQA                78                  68
SuperGPQA           76                  66
OpenBookQA          74                  64

Coding

Benchmark           DeepSeek LLM 2.0    Mistral Large 2
Category average    73                  60
HumanEval           73                  60

Mathematics

Benchmark           DeepSeek LLM 2.0    Mistral Large 2
Category average    79                  67
AIME 2023           80                  68
AIME 2024           82                  70
AIME 2025           81                  69
HMMT Feb 2023       76                  64
HMMT Feb 2024       78                  66
HMMT Feb 2025       77                  65
BRUMO 2025          79                  67

Reasoning

Benchmark           DeepSeek LLM 2.0    Mistral Large 2
Category average    76                  65
SimpleQA            77                  66
MuSR                75                  64
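The category averages reported above appear to be unweighted means of the per-benchmark scores. The Python sketch below checks that assumption and counts how many categories each model leads. The names SCORES, category_averages, and categories_won are illustrative and not part of any published tooling. The sketch does not reproduce the overall 63 vs 57 score, because the page does not state how that figure is computed.

```python
from statistics import mean

# Per-benchmark scores as (DeepSeek LLM 2.0, Mistral Large 2),
# copied from the category listings above.
SCORES = {
    "Knowledge": {
        "MMLU": (79, 68),
        "GPQA": (78, 68),
        "SuperGPQA": (76, 66),
        "OpenBookQA": (74, 64),
    },
    "Coding": {
        "HumanEval": (73, 60),
    },
    "Mathematics": {
        "AIME 2023": (80, 68),
        "AIME 2024": (82, 70),
        "AIME 2025": (81, 69),
        "HMMT Feb 2023": (76, 64),
        "HMMT Feb 2024": (78, 66),
        "HMMT Feb 2025": (77, 65),
        "BRUMO 2025": (79, 67),
    },
    "Reasoning": {
        "SimpleQA": (77, 66),
        "MuSR": (75, 64),
    },
}


def category_averages(scores):
    """Return {category: (deepseek_mean, mistral_mean)}.

    Assumption: each category score is an unweighted mean of its benchmarks.
    """
    averages = {}
    for category, benchmarks in scores.items():
        deepseek = mean(pair[0] for pair in benchmarks.values())
        mistral = mean(pair[1] for pair in benchmarks.values())
        averages[category] = (deepseek, mistral)
    return averages


def categories_won(averages):
    """Count the categories in which the first model's mean is strictly higher."""
    return sum(1 for first, second in averages.values() if first > second)


averages = category_averages(SCORES)
for category, (deepseek, mistral) in averages.items():
    print(f"{category}: {deepseek:.2f} vs {mistral:.2f}")
print(f"Categories led by DeepSeek LLM 2.0: {categories_won(averages)} of {len(averages)}")
```

Under this assumption the Knowledge mean comes out to 76.75 vs 66.5, which matches the reported 76.8 vs 66.5 after rounding to one decimal place. The other three categories match the reported averages exactly.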

Frequently Asked Questions

Which is better, DeepSeek LLM 2.0 or Mistral Large 2?

DeepSeek LLM 2.0 scores higher overall with 63 vs 57, a difference of 6 points across all benchmarks.

Which is better for knowledge tasks, DeepSeek LLM 2.0 or Mistral Large 2?

DeepSeek LLM 2.0 leads in knowledge tasks with an average score of 76.8 vs 66.5.

Which is better for coding, DeepSeek LLM 2.0 or Mistral Large 2?

DeepSeek LLM 2.0 leads in coding with an average score of 73 vs 60.

Which is better for math, DeepSeek LLM 2.0 or Mistral Large 2?

DeepSeek LLM 2.0 leads in math with an average score of 79 vs 67.

Which is better for reasoning, DeepSeek LLM 2.0 or Mistral Large 2?

DeepSeek LLM 2.0 leads in reasoning with an average score of 76 vs 65.