GLM-4.5-Air vs Kimi K2

Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.

Quick Verdict

GLM-4.5-Air wins overall with a score of 26 vs 23 (3 point difference).GLM-4.5-Air wins 4 out of 4 categories.

Knowledge

GLM-4.5-Air

GLM-4.5-Air

32.8

Kimi K2

29.8

35
MMLU
32
34
GPQA
31
32
SuperGPQA
29
30
OpenBookQA
27

Coding

GLM-4.5-Air

GLM-4.5-Air

27

Kimi K2

24

27
HumanEval
24

Mathematics

GLM-4.5-Air

GLM-4.5-Air

34

Kimi K2

31

35
AIME 2023
32
37
AIME 2024
34
36
AIME 2025
33
31
HMMT Feb 2023
28
33
HMMT Feb 2024
30
32
HMMT Feb 2025
29
34
BRUMO 2025
31

Reasoning

GLM-4.5-Air

GLM-4.5-Air

32

Kimi K2

29

33
SimpleQA
30
31
MuSR
28

Frequently Asked Questions

Which is better, GLM-4.5-Air or Kimi K2?

GLM-4.5-Air scores higher overall with 26 vs 23, a difference of 3 points across all benchmarks.

Which is better for knowledge tasks, GLM-4.5-Air or Kimi K2?

GLM-4.5-Air leads in knowledge tasks with an average score of 32.8 vs 29.8.

Which is better for coding, GLM-4.5-Air or Kimi K2?

GLM-4.5-Air leads in coding with an average score of 27 vs 24.

Which is better for math, GLM-4.5-Air or Kimi K2?

GLM-4.5-Air leads in math with an average score of 34 vs 31.

Which is better for reasoning, GLM-4.5-Air or Kimi K2?

GLM-4.5-Air leads in reasoning with an average score of 32 vs 29.