Kimi K2.5 (Reasoning) vs o1-preview

Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.

Quick Verdict

Both models are tied with an overall score of 71.

Knowledge

Winner: Tie

Benchmark      Kimi K2.5 (Reasoning)    o1-preview
Average        89                       89
MMLU           92                       92
GPQA           90                       90
SuperGPQA      88                       88
OpenBookQA     86                       86

Coding

Winner: o1-preview

Benchmark      Kimi K2.5 (Reasoning)    o1-preview
Average        84                       86
HumanEval      84                       86

Mathematics

Winner: Tie

Benchmark        Kimi K2.5 (Reasoning)    o1-preview
Average          93                       93
AIME 2023        94                       94
AIME 2024        96                       96
AIME 2025        95                       95
HMMT Feb 2023    90                       90
HMMT Feb 2024    92                       92
HMMT Feb 2025    91                       91
BRUMO 2025       93                       93

Reasoning

Winner: Tie

Benchmark      Kimi K2.5 (Reasoning)    o1-preview
Average        87                       87
SimpleQA       88                       88
MuSR           86                       86
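
The category averages appear to follow from the per-benchmark scores listed above. The sketch below assumes each category score is the unweighted arithmetic mean of its benchmarks, rounded to the nearest integer; the page does not state this, and the helper name `category_summary` is hypothetical. Under that assumption the four category results are reproduced. The overall score of 71 is not recomputed, since the page does not say how it is derived.

```python
from statistics import mean

# Per-benchmark scores copied from this page, as
# (Kimi K2.5 (Reasoning), o1-preview) pairs.
SCORES = {
    "Knowledge": {
        "MMLU": (92, 92),
        "GPQA": (90, 90),
        "SuperGPQA": (88, 88),
        "OpenBookQA": (86, 86),
    },
    "Coding": {
        "HumanEval": (84, 86),
    },
    "Mathematics": {
        "AIME 2023": (94, 94),
        "AIME 2024": (96, 96),
        "AIME 2025": (95, 95),
        "HMMT Feb 2023": (90, 90),
        "HMMT Feb 2024": (92, 92),
        "HMMT Feb 2025": (91, 91),
        "BRUMO 2025": (93, 93),
    },
    "Reasoning": {
        "SimpleQA": (88, 88),
        "MuSR": (86, 86),
    },
}


def category_summary(benchmarks):
    """Return (kimi_avg, o1_avg, winner) for one category.

    Assumption: the category score is the rounded, unweighted mean
    of the benchmark scores. The page does not confirm this.
    """
    kimi_avg = round(mean(kimi for kimi, _ in benchmarks.values()))
    o1_avg = round(mean(o1 for _, o1 in benchmarks.values()))
    if kimi_avg == o1_avg:
        winner = "Tie"
    elif kimi_avg > o1_avg:
        winner = "Kimi K2.5 (Reasoning)"
    else:
        winner = "o1-preview"
    return kimi_avg, o1_avg, winner


if __name__ == "__main__":
    for category, benchmarks in SCORES.items():
        kimi_avg, o1_avg, winner = category_summary(benchmarks)
        print(f"{category}: {kimi_avg} vs {o1_avg} ({winner})")
```

Because Coding has a single benchmark, its "average" is just the HumanEval score, which is why the 84 vs 86 gap alone decides that category.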

Frequently Asked Questions

Which is better, Kimi K2.5 (Reasoning) or o1-preview?

Kimi K2.5 (Reasoning) and o1-preview are tied with identical overall scores of 71.

Which is better for knowledge tasks, Kimi K2.5 (Reasoning) or o1-preview?

Kimi K2.5 (Reasoning) and o1-preview are tied for knowledge tasks with average scores of 89.

Which is better for coding, Kimi K2.5 (Reasoning) or o1-preview?

o1-preview leads in coding, scoring 86 on HumanEval against 84 for Kimi K2.5 (Reasoning).

Which is better for math, Kimi K2.5 (Reasoning) or o1-preview?

Kimi K2.5 (Reasoning) and o1-preview are tied for math with average scores of 93.

Which is better for reasoning, Kimi K2.5 (Reasoning) or o1-preview?

Kimi K2.5 (Reasoning) and o1-preview are tied for reasoning with average scores of 87.