The most recent AIME examination features 15 challenging mathematics problems that test olympiad-level mathematical reasoning, each with an integer answer from 000 to 999.
As of April 20, 2026, Kimi K2.5 (Reasoning) leads the AIME 2025 leaderboard with 96.1%, followed by Kimi K2.5 (96.1%) and GLM-4.7 (95.7%).
1. Kimi K2.5 (Reasoning), Moonshot AI: 96.1%
2. Kimi K2.5, Moonshot AI: 96.1%
3. GLM-4.7, Z.AI: 95.7%
According to BenchLM.ai, Kimi K2.5 (Reasoning) leads the AIME 2025 benchmark with a score of 96.1%, followed by Kimi K2.5 (96.1%) and GLM-4.7 (95.7%). The top models are clustered within 0.4 points, suggesting this benchmark is nearing saturation for frontier models.
6 models have been evaluated on AIME 2025. The benchmark falls in the Math category, which carries a 5% weight in BenchLM.ai's overall scoring system. Within that category, AIME 2025 contributes 25% of the category score (so about 1.25% of the overall score), meaning strong performance here directly affects a model's overall ranking.
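The two stated weights compound multiplicatively into a single benchmark's share of the overall score. A minimal sketch of that arithmetic, assuming a simple weighted-average scoring scheme (the function name and exact formula are illustrative, not BenchLM's published implementation):

```python
# Weights taken from the text: Math is 5% of the overall score,
# and AIME 2025 is 25% of the Math category score.
CATEGORY_WEIGHT = 0.05
BENCHMARK_WEIGHT = 0.25

def effective_weight(category_weight: float, benchmark_weight: float) -> float:
    """Fraction of the overall score driven by one benchmark,
    assuming weights simply multiply through the hierarchy."""
    return category_weight * benchmark_weight

# AIME 2025's share of the overall score: 0.05 * 0.25 = 0.0125 (1.25%)
print(f"{effective_weight(CATEGORY_WEIGHT, BENCHMARK_WEIGHT):.4f}")
```

Under this assumption, a 1-point swing on AIME 2025 moves a model's overall score by only about 0.0125 points, which is why near-saturated benchmarks stop differentiating frontier models.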
Year: 2025
Tasks: 15 problems
Format: Integer answers 000-999
Difficulty: High school olympiad level
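The answer format above is what makes AIME easy to grade automatically: every answer is an integer from 0 to 999, conventionally written as three digits. A minimal sketch of validating and canonicalizing answers in that format (the helper name is hypothetical, not part of any AIME or BenchLM tooling):

```python
def format_aime_answer(n: int) -> str:
    """Return the canonical zero-padded three-digit string for an
    AIME answer, rejecting anything outside the 000-999 range."""
    if not isinstance(n, int) or not 0 <= n <= 999:
        raise ValueError("AIME answers must be integers in 000-999")
    return f"{n:03d}"

# Zero-padding matters for exact-match grading: 7 -> "007", 42 -> "042".
print(format_aime_answer(7))
print(format_aime_answer(999))
```

Exact string match against such canonical answers is what lets model outputs be scored without any partial credit or human judging.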
AIME 2025 represents the current standard for intermediate-level mathematical olympiad problems. Success requires sophisticated mathematical reasoning and problem-solving techniques.
Version: AIME 2025
Refresh cadence: Annual
Staleness state: Current
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
Kimi K2.5 (Reasoning) by Moonshot AI currently leads with a score of 96.1% on AIME 2025.
6 AI models have been evaluated on AIME 2025 on BenchLM.