Harvard-MIT Mathematics Tournament February 2024 (HMMT Feb 2024)

The February 2024 edition of the Harvard-MIT Mathematics Tournament, continuing the tournament's tradition of challenging competition mathematics for high school students.

How BenchLM shows HMMT Feb 2024 right now

BenchLM is tracking HMMT Feb 2024 in the local dataset, but exact-source verification records for these rows are still being attached. To avoid a blank benchmark page, BenchLM shows the current tracked rows below as a display-only reference table.

These tracked rows are useful for inspection and spot-checking, but until exact-source attachments are completed they should not be treated as fully verified public benchmark rows.

106 tracked models · Local tracked rows · Awaiting exact-source attachments · Display only

Tracked score on HMMT Feb 2024 — April 20, 2026

BenchLM mirrors the published tracked score view for HMMT Feb 2024. GPT-5.4 leads the public snapshot at 98%, followed by GPT-5.2 Pro (98%) and GPT-5.1-Codex-Max (97%). BenchLM does not use these results to rank models overall.

106 models · Math · Refreshing · Display only · Updated April 20, 2026

The published HMMT Feb 2024 snapshot is tightly clustered at the top: GPT-5.4 sits at 98%, and the third-ranked model is only one point behind. The entire top-10 spread is also one point, so the leading published scores sit in a very narrow band.

106 models have been evaluated on HMMT Feb 2024. The benchmark falls in the Math category. This category carries a 5% weight in BenchLM.ai's overall scoring system. HMMT Feb 2024 is currently displayed for reference but excluded from the scoring formula, so it does not directly affect overall rankings.
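To make the exclusion concrete, here is a minimal sketch of category-weighted overall scoring in which display-only benchmarks are simply skipped. This is an illustrative assumption, not BenchLM's actual formula; the benchmark names, weights, and scores below (other than HMMT Feb 2024's 5% Math weight) are hypothetical.

```python
# Hypothetical weighted-average scoring sketch (NOT BenchLM's real formula).
# Display-only benchmarks like HMMT Feb 2024 contribute nothing: they are
# filtered out before the weighted average is taken.
benchmarks = [
    {"name": "HMMT Feb 2024",    "score": 0.98, "weight": 0.05, "display_only": True},
    {"name": "Other Math Bench", "score": 0.90, "weight": 0.05, "display_only": False},
    {"name": "Coding Bench",     "score": 0.80, "weight": 0.10, "display_only": False},
]

# Keep only benchmarks that participate in the scoring formula.
scored = [b for b in benchmarks if not b["display_only"]]

# Weighted average over the remaining benchmarks, renormalizing the weights.
total_weight = sum(b["weight"] for b in scored)
overall = sum(b["score"] * b["weight"] for b in scored) / total_weight

print(round(overall, 4))  # 0.8333 under these made-up inputs
```

Changing HMMT Feb 2024's score in this sketch leaves `overall` unchanged, which is exactly what "displayed for reference but excluded from the scoring formula" means.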

About HMMT Feb 2024

Year

2024

Tasks

Tournament problems

Format

Competition mathematics

Difficulty

High school olympiad level

HMMT Feb 2024 maintains the mathematical rigor and creativity expected of the tournament: its problems demand advanced reasoning across algebra, combinatorics, geometry, and number theory rather than routine computation.

BenchLM freshness & provenance

Version

HMMT Feb 2024

Refresh cadence

Annual

Staleness state

Refreshing

Question availability

Public benchmark set

Refreshing · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
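The three treatments named above can be sketched as a simple decision function. This is a hypothetical illustration of the policy as described on this page, not BenchLM's actual implementation; the function name and state strings are assumptions.

```python
# Hypothetical sketch (not BenchLM's real code) of mapping freshness
# metadata to one of the three treatments described above.
def benchmark_treatment(staleness_state: str, exact_source_verified: bool) -> str:
    # Rows without exact-source attachments are shown for reference only.
    if not exact_source_verified:
        return "display-only reference"
    if staleness_state == "fresh":
        return "strong differentiator"
    if staleness_state == "refreshing":
        return "benchmark to watch"
    # Stale or unknown states fall back to reference display.
    return "display-only reference"

# HMMT Feb 2024 right now: state "refreshing", attachments still pending.
print(benchmark_treatment("refreshing", False))  # display-only reference
```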

Tracked score table (106 models)

1. GPT-5.4 (gpt-5-4): 98%
2. GPT-5.2 Pro (gpt-5-2-pro): 98%
3. GPT-5.1-Codex-Max (gpt-5-1-codex-max): 97%
4. GPT-5.2-Codex (gpt-5-2-codex): 97%
5. GPT-5.3 Codex (gpt-5-3-codex): 97%
6. Grok 4.1 (grok-4-1): 97%
7. Gemini 3 Pro Deep Think (gemini-3-pro-deep-think): 97%
8. Claude Opus 4.6 (claude-opus-4-6): 97%
9. GPT-5.1 (gpt-5-1): 97%
10. GPT-5.2 (gpt-5-2): 97%
11. Claude Sonnet 4.6 (claude-sonnet-4-6): 97%
12. Gemini 3 Pro (gemini-3-pro): 97%
13. Claude Opus 4.5 (claude-opus-4-5): 97%
14. GPT-5.3 Instant (gpt-5-3-instant): 97%
15. GPT-5.2 Instant (gpt-5-2-instant): 97%
16. GLM-5 (Reasoning) (glm-5-reasoning): 96%
17. GPT-5.3-Codex-Spark (gpt-5-3-codex-spark): 96%
18. Claude Sonnet 4.5 (claude-sonnet-4-5): 95%
19. Grok 4.1 Fast (grok-4-1-fast): 94%
20. GPT-5 (high) (gpt-5-high): 93%
21. (model name missing in source): 92%
22. Kimi K2.5 (Reasoning) (kimi-k2-5-reasoning): 92%
23. GPT-5 (medium) (gpt-5-medium): 91%
24. Qwen3.5 397B (Reasoning) (qwen3-5-397b-reasoning): 91%
25. (model name missing in source): 88%
26. GPT-5 mini (gpt-5-mini): 88%
27. (model name missing in source): 86%
28. GLM-5 (glm-5): 86%
29. Grok 4 (grok-4): 86%
30. DeepSeek V3.2 (Thinking) (deepseek-v3-2-thinking): 85%
31. GLM-4.7 (glm-4-7): 84%
32. Qwen2.5-1M (qwen2-5-1m): 83%
33. Step 3.5 Flash (step-3-5-flash): 83%
34. Gemini 2.5 Pro (gemini-2-5-pro): 82%
35. Qwen2.5-72B (qwen2-5-72b): 82%
36. DeepSeek V3.2 (deepseek-v3-2): 82%
37. Qwen3.5 397B (qwen3-5-397b): 81%
38. o4-mini (high) (o4-mini-high): 81%
39. DeepSeek Coder 2.0 (deepseek-coder-2-0): 79%
40. Mercury 2 (mercury-2): 79%
41. DeepSeekMath V2 (deepseekmath-v2): 78%
42. DeepSeek LLM 2.0 (deepseek-llm-2-0): 78%
43. MiMo-V2-Flash (mimo-v2-flash): 77%
44. Kimi K2.5 (kimi-k2-5): 75%
45. Claude 4.1 Opus (claude-4-1-opus): 74%
46. Mistral Large 3 (mistral-large-3): 74%
47. Nemotron 3 Ultra 500B (nemotron-3-ultra-500b): 72%
48. Aion-2.0 (aion-2-0): 72%
49. Claude 4 Sonnet (claude-4-sonnet): 71%
50. Ministral 3 14B (Reasoning) (ministral-3-14b-reasoning): 71%
51. MiniMax M2.5 (minimax-m2-5): 71%
52. Seed 1.6 (seed-1-6): 70%
53. Seed-2.0-Lite (seed-2-0-lite): 69%
54. Gemini 3 Flash (gemini-3-flash): 68%
55. Llama 3.1 405B (llama-3-1-405b): 68%
56. Claude Haiku 4.5 (claude-haiku-4-5): 66%
57. Mistral Large 2 (mistral-large-2): 66%
58. Ministral 3 14B (ministral-3-14b): 66%
59. Nemotron 3 Super 120B A12B (nemotron-3-super-120b-a12b): 65%
60. GPT-4o (gpt-4o): 64%
61. GLM-4.7-Flash (glm-4-7-flash): 64%
62. Nemotron 3 Super 100B (nemotron-3-super-100b): 63%
63. Claude 3.5 Sonnet (claude-3-5-sonnet): 63%
64. Mistral 8x7B (mistral-8x7b): 63%
65. Grok Code Fast 1 (grok-code-fast-1): 62%
66. Gemini 1.5 Pro (gemini-1-5-pro): 62%
67. Seed 1.6 Flash (seed-1-6-flash): 62%
68. Gemini 3.1 Flash-Lite (gemini-3-1-flash-lite): 61%
69. Gemini 1.0 Pro (gemini-1-0-pro): 60%
70. Seed-2.0-Mini (seed-2-0-mini): 60%
71. Claude 3 Opus (claude-3-opus): 59%
72. GPT-4 Turbo (gpt-4-turbo): 58%
73. Llama 3 70B (llama-3-70b): 56%
74. Nemotron 3 Nano 30B (nemotron-3-nano-30b): 55%
75. Claude 3 Haiku (claude-3-haiku): 54%
76. Nemotron-4 15B (nemotron-4-15b): 52%
77. Moonshot v1 (moonshot-v1): 51%
78. Z-1 (z-1): 50%
79. GPT-OSS 120B (gpt-oss-120b): 49%
80. Gemini 2.5 Flash (gemini-2-5-flash): 48%
81. Nemotron Ultra 253B (nemotron-ultra-253b): 47%
82. Llama 4 Behemoth (llama-4-behemoth): 46%
83. Llama 4 Scout (llama-4-scout): 45%
84. Llama 4 Maverick (llama-4-maverick): 44%
85. LFM2-24B-A2B (lfm2-24b-a2b): 44%
86. Gemma 3 27B (gemma-3-27b): 43%
87. DeepSeek-R1 (deepseek-r1): 42%
88. Grok 3 [Beta] (grok-3-beta): 40%
89. Nova Pro (nova-pro): 39%
90. Qwen3 235B 2507 (Reasoning) (qwen3-235b-2507-reasoning): 38%
91. Qwen3 235B 2507 (qwen3-235b-2507): 37%
92. Claude 4.1 Opus Thinking (claude-4-1-opus-thinking): 36%
93. GLM-4.5 (glm-4-5): 35%
94. MiniMax M1 80k (minimax-m1-80k): 34%
95. GLM-4.5-Air (glm-4-5-air): 33%
96. DeepSeek V3.1 (Reasoning) (deepseek-v3-1-reasoning): 32%
97. DeepSeek V3.1 (deepseek-v3-1): 31%
98. Ministral 3 8B (Reasoning) (ministral-3-8b-reasoning): 31%
99. GPT-OSS 20B (gpt-oss-20b): 29%
100. Mistral 7B v0.3 (mistral-7b-v0-3): 28%
101. Ministral 3 8B (ministral-3-8b): 28%
102. Mistral 8x7B v0.2 (mistral-8x7b-v0-2): 27%
103. LFM2.5-1.2B-Thinking (lfm2-5-1-2b-thinking): 26%
104. Ministral 3 3B (Reasoning) (ministral-3-3b-reasoning): 25%
105. LFM2.5-1.2B-Instruct (lfm2-5-1-2b-instruct): 22%
106. Ministral 3 3B (ministral-3-3b): 21%

FAQ

What does HMMT Feb 2024 measure?

HMMT Feb 2024 measures performance on problems from the February 2024 edition of the Harvard-MIT Mathematics Tournament, a challenging high school mathematics competition.

Which model leads the published HMMT Feb 2024 snapshot?

GPT-5.4 currently leads the published HMMT Feb 2024 snapshot with a tracked score of 98%. BenchLM shows this benchmark for display only and does not use it in overall rankings.

How many models are evaluated on HMMT Feb 2024?

106 AI models are included in BenchLM's mirrored HMMT Feb 2024 snapshot, based on the public leaderboard captured on April 20, 2026.

Last updated: April 20, 2026 · mirrored from the public benchmark leaderboard