Best Multilingual LLMs in 2026

Top AI models ranked by average score on multilingual benchmarks, including MGSM.

Rank  Model                         Provider     License      Context  Score (avg)
1     GPT-5.3 Codex                 OpenAI       Proprietary  400K     96
2     Claude Opus 4.6               Anthropic    Proprietary  1M       96
3     Gemini 3.1 Pro                Google       Proprietary  1M       96
4     Grok 4.1                      xAI          Proprietary  128K     96
5     GPT-5.4                       OpenAI       Proprietary  1M       95
6     GPT-5.2                       OpenAI       Proprietary  400K     95
7     Gemini 3 Pro Deep Think       Google       Proprietary  2M       92
8     GPT-5.2-Codex                 OpenAI       Proprietary  400K     91
9     Claude Sonnet 4.6             Anthropic    Proprietary  1M       91
10    Claude Sonnet 4.5             Anthropic    Proprietary  1M       91
11    Qwen3.5 397B (Reasoning)      Alibaba      Open Weight  128K     91
12    Claude Opus 4.5               Anthropic    Proprietary  200K     90
13    o1-preview                    OpenAI       Proprietary  200K     90
14    GPT-5 (medium)                OpenAI       Proprietary  128K     90
15    GPT-5.1-Codex-Max             OpenAI       Proprietary  400K     89
16    GPT-5.1                       OpenAI       Proprietary  400K     89
17    GPT-5 (high)                  OpenAI       Proprietary  128K     89
18    Gemini 3 Pro                  Google       Proprietary  2M       89
19    GLM-5 (Reasoning)             Zhipu AI     Open Weight  200K     89
20    Grok 4.1 Fast                 xAI          Proprietary  2M       88
21    Kimi K2.5 (Reasoning)         Moonshot AI  Open Weight  128K     88
22    DeepSeekMath V2               DeepSeek     Open Weight  128K     87
23    Claude 4.1 Opus               Anthropic    Proprietary  200K     85
24    Gemini 3 Flash                Google       Proprietary  1M       85
25    GLM-4.7-Flash                 Zhipu AI     Open Weight  200K     85
26    Claude 3.5 Sonnet             Anthropic    Proprietary  200K     85
27    DeepSeek V3.2 (Thinking)      DeepSeek     Open Weight  128K     84
28    Grok 4                        xAI          Proprietary  128K     84
29    GLM-5                         Zhipu AI     Open Weight  200K     84
30    Qwen2.5-72B                   Alibaba      Open Weight  128K     84
31    DeepSeek V3.2                 DeepSeek     Open Weight  128K     84
32    Gemini 2.5 Pro                Google       Proprietary  2M       84
33    Claude 4 Sonnet               Anthropic    Proprietary  200K     84
34    MiniMax M2.5                  MiniMax      Proprietary  128K     84
35    Llama 3.1 405B                Meta         Open Weight  128K     84
36    Nemotron 3 Super 100B         NVIDIA       Open Weight  32K      84
37    o3-pro                        OpenAI       Proprietary  200K     83
38    o3                            OpenAI       Proprietary  200K     83
39    DeepSeek Coder 2.0            DeepSeek     Open Weight  128K     83
40    o4-mini (high)                OpenAI       Proprietary  200K     83
41    MiMo-V2-Flash                 Xiaomi       Open Weight  128K     83
42    Kimi K2.5                     Moonshot AI  Open Weight  128K     83
43    GPT-5 mini                    OpenAI       Proprietary  128K     82
44    Qwen3.5 397B                  Alibaba      Open Weight  128K     82
45    DeepSeek LLM 2.0              DeepSeek     Open Weight  128K     82
46    Mistral Large 3               Mistral      Open Weight  128K     82
47    Claude Haiku 4.5              Anthropic    Proprietary  200K     82
48    GPT-4o                        OpenAI       Proprietary  128K     82
49    GLM-4.7                       Zhipu AI     Open Weight  200K     81
50    Qwen2.5-1M                    Alibaba      Open Weight  1M       81
51    Nemotron 3 Ultra 500B         NVIDIA       Open Weight  32K      81
52    Mistral Large 2               Mistral      Proprietary  128K     81
53    Gemini 1.5 Pro                Google       Proprietary  2M       76
54    Grok Code Fast 1              xAI          Proprietary  256K     75
55    GPT-4 Turbo                   OpenAI       Proprietary  128K     75
56    Nemotron-4 15B                NVIDIA       Open Weight  32K      75
57    Nemotron 3 Nano 30B           NVIDIA       Open Weight  32K      75
58    Mistral 8x7B                  Mistral      Open Weight  32K      74
59    Z-1                           Z            Proprietary  128K     74
60    Nemotron Ultra 253B           NVIDIA       Open Weight  32K      74
61    Gemini 2.5 Flash              Google       Proprietary  1M       74
62    Gemini 3.1 Flash-Lite         Google       Proprietary  1M       73
63    Claude 3 Opus                 Anthropic    Proprietary  200K     73
64    Claude 3 Haiku                Anthropic    Proprietary  200K     73
65    Moonshot v1                   Moonshot AI  Proprietary  128K     73
66    Gemini 1.0 Pro                Google       Proprietary  32K      72
67    Llama 3 70B                   Meta         Open Weight  128K     72
68    GPT-OSS 120B                  OpenAI       Open Weight  128K     72
69    Llama 4 Behemoth              Meta         Open Weight  32K      66
70    Gemma 3 27B                   Google       Open Weight  32K      64
71    DeepSeek V3.1 (Reasoning)     DeepSeek     Open Weight  128K     64
72    DeepSeek V3.1                 DeepSeek     Open Weight  128K     64
73    Llama 4 Scout                 Meta         Open Weight  32K      63
74    Llama 4 Maverick              Meta         Open Weight  32K      63
75    Qwen2.5-VL-32B                Alibaba      Open Weight  32K      63
76    Qwen3 235B 2507               Alibaba      Open Weight  128K     63
77    MiniMax M1 80k                MiniMax      Proprietary  80K      63
78    GLM-4.5-Air                   Tsinghua     Proprietary  128K     63
79    Qwen3 235B 2507 (Reasoning)   Alibaba      Open Weight  128K     62
80    Mistral 7B v0.3               Mistral      Open Weight  32K      62
81    Mistral 8x7B v0.2             Mistral      Open Weight  32K      62
82    DeepSeek-R1                   DeepSeek     Open Weight  128K     61
83    Nova Pro                      Nova AI      Proprietary  128K     61
84    Kimi K2                       Moonshot AI  Proprietary  128K     61
85    GPT-OSS 20B                   OpenAI       Open Weight  128K     61
86    Grok 3 [Beta]                 xAI          Proprietary  128K     60
87    Claude 4.1 Opus Thinking      Anthropic    Proprietary  200K     60
88    GLM-4.5                       Tsinghua     Proprietary  128K     60

Key Takeaways

  • GPT-5.3 Codex by OpenAI is listed first with a score of 96; Claude Opus 4.6, Gemini 3.1 Pro, and Grok 4.1 share the same score.
  • The best open-weight model in this ranking is Qwen3.5 397B (Reasoning) at position #11.
  • 88 models are included in this ranking.
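
The takeaways above can be re-derived from the ranking rows. The following is a minimal sketch, not the site's actual method: the Row type, the helper names, and the choice of sample rows are assumptions made for illustration, and ranks 1 to 3 are inferred from list order because the listing does not number the first three entries.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    rank: int      # position in the ranking (1 to 3 inferred from list order)
    model: str
    license: str   # "Proprietary" or "Open Weight"
    score: int     # the "avg" score shown in the listing


# A hand-copied subset of the ranking, not the full 88 rows.
ROWS = [
    Row(1, "GPT-5.3 Codex", "Proprietary", 96),
    Row(2, "Claude Opus 4.6", "Proprietary", 96),
    Row(3, "Gemini 3.1 Pro", "Proprietary", 96),
    Row(4, "Grok 4.1", "Proprietary", 96),
    Row(5, "GPT-5.4", "Proprietary", 95),
    Row(11, "Qwen3.5 397B (Reasoning)", "Open Weight", 91),
    Row(19, "GLM-5 (Reasoning)", "Open Weight", 89),
]


def top_scorers(rows: List[Row]) -> List[str]:
    """Return every model that shares the highest score, in list order."""
    best = max(r.score for r in rows)
    return [r.model for r in rows if r.score == best]


def best_open_weight(rows: List[Row]) -> Row:
    """Return the open-weight row with the lowest (best) rank."""
    open_rows = [r for r in rows if r.license == "Open Weight"]
    return min(open_rows, key=lambda r: r.rank)


if __name__ == "__main__":
    print(top_scorers(ROWS))
    print(best_open_weight(ROWS))
```

Returning all tied models rather than a single winner makes the four-way tie at 96 visible; the listing order among tied models may reflect a tie-break that the page does not state.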