The most recent AIME examination features 15 challenging mathematics problems that test olympiad-level mathematical reasoning, each with an integer answer from 000 to 999.
As of April 20, 2026, Kimi K2.5 (Reasoning) leads the AIME 2025 leaderboard with 96.1%, followed by Kimi K2.5 (96.1%) and GLM-4.7 (95.7%).
1. Kimi K2.5 (Reasoning), Moonshot AI: 96.1%
2. Kimi K2.5, Moonshot AI: 96.1%
3. GLM-4.7, Z.AI: 95.7%
According to BenchLM.ai, Kimi K2.5 (Reasoning) leads the AIME 2025 benchmark with a score of 96.1%, followed by Kimi K2.5 (96.1%) and GLM-4.7 (95.7%). The top models are clustered within 0.4 points, suggesting this benchmark is nearing saturation for frontier models.
6 models have been evaluated on AIME 2025. The benchmark falls in the Math category, which carries a 5% weight in BenchLM.ai's overall scoring system. Within that category, AIME 2025 contributes 25% of the category score (so about 1.25% of the overall score), meaning strong performance here directly affects a model's overall ranking.
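The two stated weights compound multiplicatively into a single benchmark's share of the overall score. A minimal sketch of that arithmetic, assuming a simple weighted-average scoring scheme (the function name and exact formula are illustrative, not BenchLM's published implementation):

```python
# Weights taken from the text: Math is 5% of the overall score,
# and AIME 2025 is 25% of the Math category score.
CATEGORY_WEIGHT = 0.05
BENCHMARK_WEIGHT = 0.25

def effective_weight(category_weight: float, benchmark_weight: float) -> float:
    """Fraction of the overall score driven by one benchmark,
    assuming weights simply multiply through the hierarchy."""
    return category_weight * benchmark_weight

# AIME 2025's share of the overall score: 0.05 * 0.25 = 0.0125 (1.25%)
print(f"{effective_weight(CATEGORY_WEIGHT, BENCHMARK_WEIGHT):.4f}")
```

Under this assumption, a 1-point swing on AIME 2025 moves a model's overall score by only about 0.0125 points, which is why near-saturated benchmarks stop differentiating frontier models.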
Year: 2025
Tasks: 15 problems
Format: Integer answers 000-999
Difficulty: High school olympiad level
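The answer format above is what makes AIME easy to grade automatically: every answer is an integer from 0 to 999, conventionally written as three digits. A minimal sketch of validating and canonicalizing answers in that format (the helper name is hypothetical, not part of any AIME or BenchLM tooling):

```python
def format_aime_answer(n: int) -> str:
    """Return the canonical zero-padded three-digit string for an
    AIME answer, rejecting anything outside the 000-999 range."""
    if not isinstance(n, int) or not 0 <= n <= 999:
        raise ValueError("AIME answers must be integers in 000-999")
    return f"{n:03d}"

# Zero-padding matters for exact-match grading: 7 -> "007", 42 -> "042".
print(format_aime_answer(7))
print(format_aime_answer(999))
```

Exact string match against such canonical answers is what lets model outputs be scored without any partial credit or human judging.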
AIME 2025 represents the current standard for intermediate-level mathematical olympiad problems. Success requires sophisticated mathematical reasoning and problem-solving techniques.
Version: AIME 2025
Refresh cadence: Annual
Staleness state: Current
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
Kimi K2.5 (Reasoning) by Moonshot AI currently leads with a score of 96.1% on AIME 2025.
6 AI models have been evaluated on AIME 2025 on BenchLM.