Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Both models are tied with an overall score of 68.
Knowledge: GLM-5 85, o3-pro 87.3
Coding: GLM-5 80, o3-pro 80
Math: GLM-5 87, o3-pro 89
Reasoning: GLM-5 83, o3-pro 85
GLM-5 and o3-pro are tied overall, each with a score of 68.
o3-pro leads in knowledge tasks with an average score of 87.3 vs 85.
GLM-5 and o3-pro are tied for coding with average scores of 80.
o3-pro leads in math with an average score of 89 vs 87.
o3-pro leads in reasoning with an average score of 85 vs 83.
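The per-category statements above follow directly from comparing the two average scores in each category. As a minimal sketch, assuming only the category averages listed on this page (the names `scores`, `leader`, and `summarize` are illustrative, not part of any benchmark tooling), the comparison could be reproduced like this:

```python
# Category averages as listed on this page: (GLM-5, o3-pro).
# The overall score of 68 is not recomputed here, because the page
# does not say how it is derived from the category averages.
scores = {
    "knowledge": (85, 87.3),
    "coding": (80, 80),
    "math": (87, 89),
    "reasoning": (83, 85),
}


def leader(glm5: float, o3_pro: float) -> str:
    """Return the higher-scoring model's name, or 'tie' if equal."""
    if glm5 == o3_pro:
        return "tie"
    return "GLM-5" if glm5 > o3_pro else "o3-pro"


def summarize(category_scores: dict) -> dict:
    """Map each category to its leading model (or 'tie')."""
    return {cat: leader(*pair) for cat, pair in category_scores.items()}


if __name__ == "__main__":
    for category, result in summarize(scores).items():
        print(f"{category}: {result}")
```

Keeping the scores in a single dictionary means a corrected or added category only needs one new entry for the summary to update.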