The most recent AIME examination features 15 challenging mathematics problems that test olympiad-level mathematical reasoning, with integer answers from 000 to 999.
As of April 10, 2026, Kimi K2.5 (Reasoning) leads the AIME 2025 leaderboard with 96.1%, followed by GLM-4.7 (95.7%) and MiMo-V2-Flash (94.1%).
1. Kimi K2.5 (Reasoning) (Moonshot AI): 96.1%
2. GLM-4.7 (Z.AI): 95.7%
3. MiMo-V2-Flash (Xiaomi): 94.1%
According to BenchLM.ai, Kimi K2.5 (Reasoning) leads the AIME 2025 benchmark with a score of 96.1%, followed by GLM-4.7 (95.7%) and MiMo-V2-Flash (94.1%). The top models are clustered within 2.0 points, suggesting this benchmark is nearing saturation for frontier models.
5 models have been evaluated on AIME 2025. The benchmark falls in the Math category. This category carries a 5% weight in BenchLM.ai's overall scoring system. Within that category, AIME 2025 contributes 25% of the category score, so strong performance here directly affects a model's overall ranking.
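The nested weighting can be made concrete with a small sketch, assuming BenchLM's weights simply multiply (an assumption; the exact aggregation rule is on the methodology page): a 5% category weight times a 25% within-category weight gives AIME 2025 an effective 1.25% share of the overall score.

```python
# Sketch of nested benchmark weighting (assumed multiplicative composition).
# The Math category carries 5% of the overall score, and AIME 2025
# contributes 25% of the Math category, so the benchmark's effective
# overall weight is 0.05 * 0.25 = 1.25%.

CATEGORY_WEIGHT = 0.05   # Math category's share of the overall score
BENCHMARK_WEIGHT = 0.25  # AIME 2025's share of the Math category

def overall_contribution(benchmark_score: float) -> float:
    """Points a benchmark score (in [0, 1]) adds to the overall score."""
    return benchmark_score * BENCHMARK_WEIGHT * CATEGORY_WEIGHT

# A 96.1% AIME score contributes 0.961 * 0.25 * 0.05 = 0.0120125,
# i.e. about 1.2 points on a 100-point overall scale.
print(round(overall_contribution(0.961), 4))
```

Under this assumption, even a perfect AIME 2025 score can move a model's overall ranking by at most 1.25 points.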
Year: 2025
Tasks: 15 problems
Format: Integer answers 000-999
Difficulty: High school olympiad level
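The answer format above is unusually easy to check mechanically: every AIME answer is an integer from 000 to 999, often written zero-padded to three digits. A minimal validation sketch (a hypothetical helper, not BenchLM's actual grading code):

```python
# Hypothetical helper: check whether a model's raw output is a valid
# AIME-style answer, i.e. an integer in [0, 999], optionally zero-padded.

def is_valid_aime_answer(raw: str) -> bool:
    """True if raw is a plain or zero-padded integer between 0 and 999."""
    s = raw.strip()
    return s.isdigit() and len(s) <= 3 and 0 <= int(s) <= 999

print(is_valid_aime_answer("042"))   # zero-padded answers are accepted
print(is_valid_aime_answer("1000"))  # out of range, rejected
```

This fixed answer space is part of why AIME lends itself to exact-match automated scoring, with no partial credit or judge model required.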
AIME 2025 represents the current standard for intermediate-level mathematical olympiad problems. Success requires sophisticated mathematical reasoning and problem-solving techniques.
Version: AIME 2025
Refresh cadence: Annual
Staleness state: Current
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
Kimi K2.5 (Reasoning) by Moonshot AI currently leads with a score of 96.1% on AIME 2025.
5 AI models have been evaluated on AIME 2025 on BenchLM.