Harvard-MIT Mathematics Tournament February 2025 (HMMT Feb 2025)

The most recent February edition of the Harvard-MIT Mathematics Tournament, featuring the latest challenging problems in competitive mathematics.

How BenchLM shows HMMT Feb 2025 right now

BenchLM is tracking HMMT Feb 2025 in the local dataset, but exact-source verification records for these rows are still being attached. To avoid a blank benchmark page, BenchLM shows the current tracked rows below as a display-only reference table.

These tracked rows are useful for inspection and spot-checking, but until exact-source attachments are completed they should not be treated as fully verified public benchmark rows.

107 tracked models · Local tracked rows · Awaiting exact-source attachments · Display only

Tracked score on HMMT Feb 2025 — April 20, 2026

BenchLM mirrors the published tracked score view for HMMT Feb 2025. GLM-4.7 leads the public snapshot at 97.1%, followed by GPT-5.4 (97%) and GPT-5.2 Pro (97%). BenchLM does not use these results to rank models overall.

107 models · Math · Current · Display only · Updated April 20, 2026

The published HMMT Feb 2025 snapshot is tightly clustered at the top: GLM-4.7 sits at 97.1%, with the second- and third-ranked models only 0.1 points behind at 97%. The full top-10 spread is just 1.1 points, so most of the leading published scores sit in a very narrow band.

107 models have been evaluated on HMMT Feb 2025. The benchmark falls in the Math category, which carries a 5% weight in BenchLM.ai's overall scoring system. However, HMMT Feb 2025 is currently displayed for reference only and is excluded from the scoring formula, so it does not directly affect overall rankings.
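To illustrate what "display-only, excluded from the scoring formula" means in practice, here is a minimal sketch of a category-weighted overall score that skips display-only benchmarks. The field names, weights, and scores are illustrative assumptions, not BenchLM's actual formula or data:

```python
# Hypothetical sketch: weighted overall score that excludes display-only rows.
# Weights and scores below are made up for illustration.
def overall_score(results):
    """results: list of dicts with 'weight', 'score', and 'display_only' keys."""
    scored = [r for r in results if not r["display_only"]]
    total_weight = sum(r["weight"] for r in scored)
    if total_weight == 0:
        return None  # nothing contributes to the overall score
    return sum(r["weight"] * r["score"] for r in scored) / total_weight

results = [
    {"weight": 0.05, "score": 97.1, "display_only": True},   # e.g. HMMT Feb 2025: shown, not scored
    {"weight": 0.20, "score": 88.0, "display_only": False},
    {"weight": 0.10, "score": 92.0, "display_only": False},
]
print(round(overall_score(results), 2))
```

Because the display-only row is filtered out before the weighted mean is taken, changing its score would not move the overall ranking at all.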

About HMMT Feb 2025

Year

2025

Tasks

Tournament problems

Format

Competition mathematics

Difficulty

High school olympiad level

HMMT Feb 2025 represents the current pinnacle of high school mathematics competition, with problems designed to challenge the brightest mathematical minds.

BenchLM freshness & provenance

Version

HMMT Feb 2025 (2025 edition)

Refresh cadence

Quarterly

Staleness state

Current

Question availability

Public benchmark set

Current · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

Tracked score table (107 models)

1. GLM-4.7 (glm-4-7): 97.1%
2. GPT-5.4 (gpt-5-4): 97%
3. GPT-5.2 Pro (gpt-5-2-pro): 97%
4. GPT-5.1-Codex-Max (gpt-5-1-codex-max): 96%
5. GPT-5.2-Codex (gpt-5-2-codex): 96%
6. GPT-5.3 Codex (gpt-5-3-codex): 96%
7. Grok 4.1 (grok-4-1): 96%
8. Gemini 3 Pro Deep Think (gemini-3-pro-deep-think): 96%
9. Claude Opus 4.6 (claude-opus-4-6): 96%
10. GPT-5.1 (gpt-5-1): 96%
11. GPT-5.2 (gpt-5-2): 96%
12. Claude Sonnet 4.6 (claude-sonnet-4-6): 96%
13. Gemini 3 Pro (gemini-3-pro): 96%
14. Claude Opus 4.5 (claude-opus-4-5): 96%
15. GPT-5.3 Instant (gpt-5-3-instant): 96%
16. GPT-5.2 Instant (gpt-5-2-instant): 96%
17. Kimi K2.5 (Reasoning) (kimi-k2-5-reasoning): 95.4%
18. GLM-5 (Reasoning) (glm-5-reasoning): 95%
19. GPT-5.3-Codex-Spark (gpt-5-3-codex-spark): 95%
20. Claude Sonnet 4.5 (claude-sonnet-4-5): 94%
21. Grok 4.1 Fast (grok-4-1-fast): 93%
22. GPT-5 (high) (gpt-5-high): 92%
23. (unnamed in source): 91%
24. GPT-5 (medium) (gpt-5-medium): 90%
25. Qwen3.5 397B (Reasoning) (qwen3-5-397b-reasoning): 90%
26. (unnamed in source): 87%
27. GPT-5 mini (gpt-5-mini): 87%
28. (unnamed in source): 85%
29. GLM-5 (glm-5): 85%
30. Grok 4 (grok-4): 85%
31. DeepSeek V3.2 (Thinking) (deepseek-v3-2-thinking): 84%
32. Qwen2.5-1M (qwen2-5-1m): 82%
33. Step 3.5 Flash (step-3-5-flash): 82%
34. Gemini 2.5 Pro (gemini-2-5-pro): 81%
35. Qwen2.5-72B (qwen2-5-72b): 81%
36. DeepSeek V3.2 (deepseek-v3-2): 81%
37. Qwen3.5 397B (qwen3-5-397b): 80%
38. o4-mini (high) (o4-mini-high): 80%
39. DeepSeek Coder 2.0 (deepseek-coder-2-0): 78%
40. Mercury 2 (mercury-2): 78%
41. DeepSeekMath V2 (deepseekmath-v2): 77%
42. DeepSeek LLM 2.0 (deepseek-llm-2-0): 77%
43. MiMo-V2-Flash (mimo-v2-flash): 76%
44. Kimi K2.5 (kimi-k2-5): 74%
45. Claude 4.1 Opus (claude-4-1-opus): 73%
46. Mistral Large 3 (mistral-large-3): 73%
47. Nemotron 3 Ultra 500B (nemotron-3-ultra-500b): 71%
48. Aion-2.0 (aion-2-0): 71%
49. Claude 4 Sonnet (claude-4-sonnet): 70%
50. MiniMax M2.5 (minimax-m2-5): 70%
51. Seed 1.6 (seed-1-6): 69%
52. Seed-2.0-Lite (seed-2-0-lite): 68%
53. Ministral 3 14B (Reasoning) (ministral-3-14b-reasoning): 67.5%
54. Gemini 3 Flash (gemini-3-flash): 67%
55. Llama 3.1 405B (llama-3-1-405b): 67%
56. Claude Haiku 4.5 (claude-haiku-4-5): 65%
57. Mistral Large 2 (mistral-large-2): 65%
58. Ministral 3 14B (ministral-3-14b): 65%
59. Nemotron 3 Super 120B A12B (nemotron-3-super-120b-a12b): 64%
60. GPT-4o (gpt-4o): 63%
61. GLM-4.7-Flash (glm-4-7-flash): 63%
62. Nemotron 3 Super 100B (nemotron-3-super-100b): 62%
63. Claude 3.5 Sonnet (claude-3-5-sonnet): 62%
64. Mistral 8x7B (mistral-8x7b): 62%
65. Grok Code Fast 1 (grok-code-fast-1): 61%
66. Gemini 1.5 Pro (gemini-1-5-pro): 61%
67. Seed 1.6 Flash (seed-1-6-flash): 61%
68. Gemini 3.1 Flash-Lite (gemini-3-1-flash-lite): 60%
69. Gemini 1.0 Pro (gemini-1-0-pro): 59%
70. Seed-2.0-Mini (seed-2-0-mini): 59%
71. Claude 3 Opus (claude-3-opus): 58%
72. GPT-4 Turbo (gpt-4-turbo): 57%
73. Llama 3 70B (llama-3-70b): 55%
74. Nemotron 3 Nano 30B (nemotron-3-nano-30b): 54%
75. Claude 3 Haiku (claude-3-haiku): 53%
76. Nemotron-4 15B (nemotron-4-15b): 51%
77. Moonshot v1 (moonshot-v1): 50%
78. Z-1 (z-1): 49%
79. GPT-OSS 120B (gpt-oss-120b): 48%
80. Gemini 2.5 Flash (gemini-2-5-flash): 47%
81. Nemotron Ultra 253B (nemotron-ultra-253b): 46%
82. Llama 4 Behemoth (llama-4-behemoth): 45%
83. Llama 4 Scout (llama-4-scout): 44%
84. Llama 4 Maverick (llama-4-maverick): 43%
85. LFM2-24B-A2B (lfm2-24b-a2b): 43%
86. Gemma 3 27B (gemma-3-27b): 42%
87. DeepSeek-R1 (deepseek-r1): 41%
88. Grok 3 [Beta] (grok-3-beta): 39%
89. Kimi K2 (kimi-k2): 38.8%
90. Nova Pro (nova-pro): 38%
91. Qwen3 235B 2507 (Reasoning) (qwen3-235b-2507-reasoning): 37%
92. Qwen3 235B 2507 (qwen3-235b-2507): 36%
93. Claude 4.1 Opus Thinking (claude-4-1-opus-thinking): 35%
94. GLM-4.5 (glm-4-5): 34%
95. MiniMax M1 80k (minimax-m1-80k): 33%
96. GLM-4.5-Air (glm-4-5-air): 32%
97. DeepSeek V3.1 (Reasoning) (deepseek-v3-1-reasoning): 31%
98. DeepSeek V3.1 (deepseek-v3-1): 30%
99. Ministral 3 8B (Reasoning) (ministral-3-8b-reasoning): 30%
100. GPT-OSS 20B (gpt-oss-20b): 28%
101. Mistral 7B v0.3 (mistral-7b-v0-3): 27%
102. Ministral 3 8B (ministral-3-8b): 27%
103. Mistral 8x7B v0.2 (mistral-8x7b-v0-2): 26%
104. LFM2.5-1.2B-Thinking (lfm2-5-1-2b-thinking): 25%
105. Ministral 3 3B (Reasoning) (ministral-3-3b-reasoning): 24%
106. LFM2.5-1.2B-Instruct (lfm2-5-1-2b-instruct): 21%
107. Ministral 3 3B (ministral-3-3b): 20%

FAQ

What does HMMT Feb 2025 measure?

HMMT Feb 2025 measures a model's ability to solve problems from the February 2025 edition of the Harvard-MIT Mathematics Tournament, a high school olympiad-level mathematics competition featuring some of the most challenging problems in competitive mathematics.

Which model leads the published HMMT Feb 2025 snapshot?

GLM-4.7 currently leads the published HMMT Feb 2025 snapshot with a tracked score of 97.1%. BenchLM shows this benchmark for display only and does not use it in overall rankings.

How many models are evaluated on HMMT Feb 2025?

107 AI models are included in BenchLM's mirrored HMMT Feb 2025 snapshot, based on the public leaderboard captured on April 20, 2026.

Last updated: April 20, 2026 · mirrored from the public benchmark leaderboard
