Mistral Large 3 vs Nemotron Ultra 253B

Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.

Quick Verdict

Mistral Large 3 wins overall with a score of 61 vs 40, a 21-point difference, and leads in all 4 categories.

Knowledge

Benchmark      Mistral Large 3   Nemotron Ultra 253B
Average        73.8              46.8
MMLU           76                49
GPQA           75                48
SuperGPQA      73                46
OpenBookQA     71                44

Coding

Benchmark      Mistral Large 3   Nemotron Ultra 253B
Average        68                41
HumanEval      68                41

Mathematics

Benchmark      Mistral Large 3   Nemotron Ultra 253B
Average        75                48
AIME 2023      76                49
AIME 2024      78                51
AIME 2025      77                50
HMMT Feb 2023  72                45
HMMT Feb 2024  74                47
HMMT Feb 2025  73                46
BRUMO 2025     75                48

Reasoning

Benchmark      Mistral Large 3   Nemotron Ultra 253B
Average        72                46
SimpleQA       73                47
MuSR           71                45

Frequently Asked Questions

Which is better, Mistral Large 3 or Nemotron Ultra 253B?

Mistral Large 3 scores higher overall with 61 vs 40, a difference of 21 points across all benchmarks.

Which is better for knowledge tasks, Mistral Large 3 or Nemotron Ultra 253B?

Mistral Large 3 leads in knowledge tasks with an average score of 73.8 vs 46.8.

Which is better for coding, Mistral Large 3 or Nemotron Ultra 253B?

Mistral Large 3 leads in coding with an average score of 68 vs 41.

Which is better for math, Mistral Large 3 or Nemotron Ultra 253B?

Mistral Large 3 leads in math with an average score of 75 vs 48.

Which is better for reasoning, Mistral Large 3 or Nemotron Ultra 253B?

Mistral Large 3 leads in reasoning with an average score of 72 vs 46.
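
The category averages quoted above appear to be plain unweighted means of the per-benchmark scores in each category. The sketch below recomputes them under that assumption. The names SCORES and category_averages are illustrative and do not come from the source. The exact knowledge means come out as 73.75 and 46.75, which the page shows rounded as 73.8 and 46.8. The overall scores of 61 and 40 are not reproduced by averaging these four categories, so they presumably rely on a weighting or on data not shown here.

```python
from statistics import mean

# Per-benchmark scores copied from the category listings above.
# Each pair is (Mistral Large 3, Nemotron Ultra 253B).
SCORES = {
    "Knowledge": {
        "MMLU": (76, 49),
        "GPQA": (75, 48),
        "SuperGPQA": (73, 46),
        "OpenBookQA": (71, 44),
    },
    "Coding": {
        "HumanEval": (68, 41),
    },
    "Mathematics": {
        "AIME 2023": (76, 49),
        "AIME 2024": (78, 51),
        "AIME 2025": (77, 50),
        "HMMT Feb 2023": (72, 45),
        "HMMT Feb 2024": (74, 47),
        "HMMT Feb 2025": (73, 46),
        "BRUMO 2025": (75, 48),
    },
    "Reasoning": {
        "SimpleQA": (73, 47),
        "MuSR": (71, 45),
    },
}


def category_averages(scores):
    """Return {category: (mistral_mean, nemotron_mean)}.

    Assumption: each category score is an unweighted mean of its benchmarks.
    """
    result = {}
    for category, benchmarks in scores.items():
        pairs = list(benchmarks.values())
        mistral = mean(pair[0] for pair in pairs)
        nemotron = mean(pair[1] for pair in pairs)
        result[category] = (mistral, nemotron)
    return result


if __name__ == "__main__":
    for category, (mistral, nemotron) in category_averages(SCORES).items():
        print(f"{category}: {mistral:.2f} vs {nemotron:.2f}")
```

Keeping the raw pairs in one dictionary makes it easy to add a benchmark or try a different aggregation, such as a weighted mean, without touching the rest of the code.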