Best LLMs for Math in 2026

Top AI models ranked by average performance on mathematics benchmarks, including AIME, HMMT, and BRUMO.

Rank  Model                        Provider     License      Context  Avg
1     GPT-5.4                      OpenAI       Proprietary  1M       97.1
2     Gemini 3.1 Pro               Google       Proprietary  1M       97.1
3     Claude Opus 4.6              Anthropic    Proprietary  1M       97.1
4     GPT-5.3 Codex                OpenAI       Proprietary  400K     97.1
5     Grok 4.1                     xAI          Proprietary  128K     97.1
6     GPT-5.2                      OpenAI       Proprietary  400K     97.1
7     GPT-5.2-Codex                OpenAI       Proprietary  400K     97.1
8     Gemini 3 Pro Deep Think      Google       Proprietary  2M       97.1
9     Claude Sonnet 4.6            Anthropic    Proprietary  1M       97.1
10    Claude Opus 4.5              Anthropic    Proprietary  200K     97.1
11    Gemini 3 Pro                 Google       Proprietary  2M       97.1
12    GPT-5.1-Codex-Max            OpenAI       Proprietary  400K     97.1
13    GPT-5.1                      OpenAI       Proprietary  400K     97.1
14    GLM-5 (Reasoning)            Zhipu AI     Open Weight  200K     96.6
15    Claude Sonnet 4.5            Anthropic    Proprietary  1M       96
16    Grok 4.1 Fast                xAI          Proprietary  2M       95
17    GPT-5 (high)                 OpenAI       Proprietary  128K     94
18    o1-preview                   OpenAI       Proprietary  200K     93
19    Kimi K2.5 (Reasoning)        Moonshot AI  Open Weight  128K     93
20    GPT-5 (medium)               OpenAI       Proprietary  128K     92
21    Qwen3.5 397B (Reasoning)     Alibaba      Open Weight  128K     92
22    GPT-5 mini                   OpenAI       Proprietary  128K     89
23    o3-pro                       OpenAI       Proprietary  200K     89
24    o3                           OpenAI       Proprietary  200K     87
25    GLM-5                        Zhipu AI     Open Weight  200K     87
26    Grok 4                       xAI          Proprietary  128K     86.6
27    DeepSeek V3.2 (Thinking)     DeepSeek     Open Weight  128K     86
28    GLM-4.7                      Zhipu AI     Open Weight  200K     85
29    Qwen2.5-1M                   Alibaba      Open Weight  1M       84
30    Qwen2.5-72B                  Alibaba      Open Weight  128K     83
31    Gemini 2.5 Pro               Google       Proprietary  2M       83
32    DeepSeek V3.2                DeepSeek     Open Weight  128K     83
33    o4-mini (high)               OpenAI       Proprietary  200K     82
34    Qwen3.5 397B                 Alibaba      Open Weight  128K     82
35    DeepSeek Coder 2.0           DeepSeek     Open Weight  128K     80
36    DeepSeek LLM 2.0             DeepSeek     Open Weight  128K     79
37    DeepSeekMath V2              DeepSeek     Open Weight  128K     79
38    MiMo-V2-Flash                Xiaomi       Open Weight  128K     78
39    Kimi K2.5                    Moonshot AI  Open Weight  128K     76
40    Claude 4.1 Opus              Anthropic    Proprietary  200K     75
41    Mistral Large 3              Mistral      Open Weight  128K     75
42    Nemotron 3 Ultra 500B        NVIDIA       Open Weight  32K      73
43    Claude 4 Sonnet              Anthropic    Proprietary  200K     72
44    MiniMax M2.5                 MiniMax      Proprietary  128K     72
45    Llama 3.1 405B               Meta         Open Weight  128K     69
46    Gemini 3 Flash               Google       Proprietary  1M       69
47    Mistral Large 2              Mistral      Proprietary  128K     67
48    Claude Haiku 4.5             Anthropic    Proprietary  200K     67
49    GPT-4o                       OpenAI       Proprietary  128K     65
50    GLM-4.7-Flash                Zhipu AI     Open Weight  200K     65
51    Mistral 8x7B                 Mistral      Open Weight  32K      64
52    Claude 3.5 Sonnet            Anthropic    Proprietary  200K     64
53    Nemotron 3 Super 100B        NVIDIA       Open Weight  32K      64
54    Gemini 1.5 Pro               Google       Proprietary  2M       63
55    Grok Code Fast 1             xAI          Proprietary  256K     63
56    Gemini 3.1 Flash-Lite        Google       Proprietary  1M       62
57    Gemini 1.0 Pro               Google       Proprietary  32K      61
58    Claude 3 Opus                Anthropic    Proprietary  200K     60
59    GPT-4 Turbo                  OpenAI       Proprietary  128K     59
60    Llama 3 70B                  Meta         Open Weight  128K     57
61    Nemotron 3 Nano 30B          NVIDIA       Open Weight  32K      56
62    Claude 3 Haiku               Anthropic    Proprietary  200K     55
63    Nemotron-4 15B               NVIDIA       Open Weight  32K      53
64    Moonshot v1                  Moonshot AI  Proprietary  128K     52
65    Z-1                          Z            Proprietary  128K     51
66    GPT-OSS 120B                 OpenAI       Open Weight  128K     50
67    Gemini 2.5 Flash             Google       Proprietary  1M       49
68    Nemotron Ultra 253B          NVIDIA       Open Weight  32K      48
69    Llama 4 Behemoth             Meta         Open Weight  32K      47
70    Llama 4 Scout                Meta         Open Weight  32K      46
71    Llama 4 Maverick             Meta         Open Weight  32K      45
72    Gemma 3 27B                  Google       Open Weight  32K      44
73    DeepSeek-R1                  DeepSeek     Open Weight  128K     43
74    Qwen2.5-VL-32B               Alibaba      Open Weight  32K      42
75    Grok 3 [Beta]                xAI          Proprietary  128K     41
76    Nova Pro                     Nova AI      Proprietary  128K     40
77    Qwen3 235B 2507 (Reasoning)  Alibaba      Open Weight  128K     39
78    Qwen3 235B 2507              Alibaba      Open Weight  128K     38
79    Claude 4.1 Opus Thinking     Anthropic    Proprietary  200K     37
80    GLM-4.5                      Tsinghua     Proprietary  128K     36
81    MiniMax M1 80k               MiniMax      Proprietary  80K      35
82    GLM-4.5-Air                  Tsinghua     Proprietary  128K     34
83    DeepSeek V3.1 (Reasoning)    DeepSeek     Open Weight  128K     33
84    DeepSeek V3.1                DeepSeek     Open Weight  128K     32
85    Kimi K2                      Moonshot AI  Proprietary  128K     31
86    GPT-OSS 20B                  OpenAI       Open Weight  128K     30
87    Mistral 7B v0.3              Mistral      Open Weight  32K      29
88    Mistral 8x7B v0.2            Mistral      Open Weight  32K      28

Key Takeaways

  • GPT-5.4 by OpenAI is listed first with an average score of 97.1, a score shared by all of the top 13 models.
  • The best open-weight model in this ranking is GLM-5 (Reasoning) at position #14, with an average score of 96.6.
  • 88 models are included in this ranking.
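
For readers who want to check the takeaways against the data, here is a minimal Python sketch of how they could be derived. It uses only a handful of rows copied from the ranking above, not the full list, and the names Entry, ROWS, tied_at_top, and best_open_weight are hypothetical helpers introduced for illustration; the page does not say how its own takeaways are computed.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    rank: int
    model: str
    license: str
    score: float


# A few rows copied from the ranking above (illustrative subset only).
ROWS = [
    Entry(1, "GPT-5.4", "Proprietary", 97.1),
    Entry(2, "Gemini 3.1 Pro", "Proprietary", 97.1),
    Entry(13, "GPT-5.1", "Proprietary", 97.1),
    Entry(14, "GLM-5 (Reasoning)", "Open Weight", 96.6),
    Entry(15, "Claude Sonnet 4.5", "Proprietary", 96.0),
    Entry(19, "Kimi K2.5 (Reasoning)", "Open Weight", 93.0),
]


def tied_at_top(rows):
    """Return every model that shares the highest average score."""
    best = max(r.score for r in rows)
    return [r.model for r in rows if r.score == best]


def best_open_weight(rows):
    """Return the open-weight entry with the best (lowest) rank."""
    open_rows = [r for r in rows if r.license == "Open Weight"]
    return min(open_rows, key=lambda r: r.rank)


if __name__ == "__main__":
    print("Tied at the top:", ", ".join(tied_at_top(ROWS)))
    print("Best open-weight:", best_open_weight(ROWS).model)
```

Listing every model tied at the highest score, rather than only the first row, makes it visible when the number one position is shared.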