Benchmark profile

Harvard-MIT Mathematics Tournament February 2023 (HMMT Feb 2023)

A prestigious high school mathematics competition hosted jointly by Harvard and MIT, featuring challenging problems across various mathematical disciplines.

Data verified July 20, 2026

How BenchLM shows HMMT Feb 2023 right now

BenchLM is tracking HMMT Feb 2023 in the local dataset, but exact-source verification records for these rows are still being attached. To avoid a blank benchmark page, BenchLM shows the current tracked rows below as a display-only reference table.

These tracked rows are useful for inspection and spot-checking, but until exact-source attachments are completed they should not be treated as fully verified public benchmark rows.

105 tracked models · Local tracked rows · Awaiting exact-source attachments · Display only

Harvard-MIT Mathematics Tournament

Tracked score on HMMT Feb 2023 — July 20, 2026

BenchLM mirrors the published tracked score view for HMMT Feb 2023. GPT-5.4 leads the public snapshot at 96%, followed by GPT-5.2 Pro (96%) and GPT-5.1-Codex-Max (95%). BenchLM does not use these results to rank models overall.

GPT-5.4 · OpenAI · gpt-5-4 · 96% · Overall —

GPT-5.2 Pro · OpenAI · gpt-5-2-pro · 96% · Overall —

GPT-5.1-Codex-Max · OpenAI · gpt-5-1-codex-max · 95% · Overall —

105 models · Math · Stale · Display only · Updated July 20, 2026

Tracked score table (105 models)

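| Rank | Model | Provider | Score |
|---|---|---|---|
| 1 | GPT-5.4 | OpenAI | 96% |
| 2 | GPT-5.2 Pro | OpenAI | 96% |
| 3 | GPT-5.1-Codex-Max | OpenAI | 95% |
| 4 | GPT-5.2-Codex | OpenAI | 95% |
| 5 | GPT-5.3 Codex | OpenAI | 95% |
| 6 | Grok 4.1 | xAI | 95% |
| 7 | Gemini 3 Pro Deep Think | Google | 95% |
| 8 | Claude Opus 4.6 | Anthropic | 95% |
| 9 | GPT-5.1 | OpenAI | 95% |
| 10 | GPT-5.2 | OpenAI | 95% |
| 11 | Claude Sonnet 4.6 | Anthropic | 95% |
| 12 | Gemini 3 Pro | Google | 95% |
| 13 | Claude Opus 4.5 | Anthropic | 95% |
| 14 | GPT-5.3 Instant | OpenAI | 95% |
| 15 | GPT-5.2 Instant | OpenAI | 95% |
| 16 | GLM-5 (Reasoning) | Z.AI | 94% |
| 17 | GPT-5.3-Codex-Spark | OpenAI | 94% |
| 18 | Claude Sonnet 4.5 | Anthropic | 93% |
| 19 | Grok 4.1 Fast | xAI | 92% |
| 20 | GPT-5 (high) | OpenAI | 91% |
| 21 | o1-preview | OpenAI | 90% |
| 22 | Kimi K2.5 (Reasoning) | Moonshot AI | 90% |
| 23 | GPT-5 (medium) | OpenAI | 89% |
| 24 | Qwen3.5 397B (Reasoning) | Alibaba | 89% |
| 25 | o3-pro | OpenAI | 86% |
| 26 | GPT-5 mini | OpenAI | 86% |
| 27 | o3 | OpenAI | 84% |
| 28 | GLM-5 | Z.AI | 84% |
| 29 | Grok 4 | xAI | 84% |
| 30 | DeepSeek V3.2 (Thinking) | DeepSeek | 83% |
| 31 | GLM-4.7 | Z.AI | 82% |
| 32 | Qwen2.5-1M | Alibaba | 81% |
| 33 | Step 3.5 Flash | StepFun | 81% |
| 34 | Gemini 2.5 Pro | Google | 80% |
| 35 | Qwen2.5-72B | Alibaba | 80% |
| 36 | DeepSeek V3.2 | DeepSeek | 80% |
| 37 | Qwen3.5 397B | Alibaba | 79% |
| 38 | o4-mini (high) | OpenAI | 79% |
| 39 | DeepSeek Coder 2.0 | DeepSeek | 77% |
| 40 | Mercury 2 | Inception | 77% |
| 41 | DeepSeekMath V2 | DeepSeek | 76% |
| 42 | DeepSeek LLM 2.0 | DeepSeek | 76% |
| 43 | MiMo-V2-Flash | Xiaomi | 75% |
| 44 | Kimi K2.5 | Moonshot AI | 73% |
| 45 | Claude 4.1 Opus | Anthropic | 72% |
| 46 | Mistral Large 3 | Mistral | 72% |
| 47 | Aion-2.0 | Aion Labs | 70% |
| 48 | Claude 4 Sonnet | Anthropic | 69% |
| 49 | Ministral 3 14B (Reasoning) | Mistral | 69% |
| 50 | MiniMax M2.5 | MiniMax | 69% |
| 51 | Seed 1.6 | ByteDance | 68% |
| 52 | Seed-2.0-Lite | ByteDance | 67% |
| 53 | Gemini 3 Flash | Google | 66% |
| 54 | Llama 3.1 405B | Meta | 66% |
| 55 | Claude Haiku 4.5 | Anthropic | 64% |
| 56 | Mistral Large 2 | Mistral | 64% |
| 57 | Ministral 3 14B | Mistral | 64% |
| 58 | Nemotron 3 Super 120B A12B | NVIDIA | 63% |
| 59 | GPT-4o | OpenAI | 62% |
| 60 | GLM-4.7-Flash | Z.AI | 62% |
| 61 | Nemotron 3 Super 100B | NVIDIA | 61% |
| 62 | Claude 3.5 Sonnet | Anthropic | 61% |
| 63 | Mistral 8x7B | Mistral | 61% |
| 64 | Grok Code Fast 1 | xAI | 60% |
| 65 | Gemini 1.5 Pro | Google | 60% |
| 66 | Seed 1.6 Flash | ByteDance | 60% |
| 67 | Gemini 3.1 Flash-Lite | Google | 59% |
| 68 | Gemini 1.0 Pro | Google | 58% |
| 69 | Seed-2.0-Mini | ByteDance | 58% |
| 70 | Claude 3 Opus | Anthropic | 57% |
| 71 | GPT-4 Turbo | OpenAI | 56% |
| 72 | Llama 3 70B | Meta | 54% |
| 73 | Nemotron 3 Nano 30B | NVIDIA | 53% |
| 74 | Claude 3 Haiku | Anthropic | 52% |
| 75 | Nemotron-4 15B | NVIDIA | 50% |
| 76 | Moonshot v1 | Moonshot AI | 49% |
| 77 | Z-1 | Z | 48% |
| 78 | GPT-OSS 120B | OpenAI | 47% |
| 79 | Gemini 2.5 Flash | Google | 46% |
| 80 | Nemotron Ultra 253B | NVIDIA | 45% |
| 81 | Llama 4 Behemoth | Meta | 44% |
| 82 | Llama 4 Scout | Meta | 43% |
| 83 | Llama 4 Maverick | Meta | 42% |
| 84 | LFM2-24B-A2B | LiquidAI | 42% |
| 85 | Gemma 3 27B | Google | 41% |
| 86 | DeepSeek-R1 | DeepSeek | 40% |
| 87 | Grok 3 [Beta] | xAI | 38% |
| 88 | Nova Pro | Amazon | 37% |
| 89 | Qwen3 235B 2507 (Reasoning) | Alibaba | 36% |
| 90 | Qwen3 235B 2507 | Alibaba | 35% |
| 91 | Claude 4.1 Opus Thinking | Anthropic | 34% |
| 92 | GLM-4.5 | Z.AI | 33% |
| 93 | MiniMax M1 80k | MiniMax | 32% |
| 94 | GLM-4.5-Air | Z.AI | 31% |
| 95 | DeepSeek V3.1 (Reasoning) | DeepSeek | 30% |
| 96 | DeepSeek V3.1 | DeepSeek | 29% |
| 97 | Ministral 3 8B (Reasoning) | Mistral | 29% |
| 98 | GPT-OSS 20B | OpenAI | 27% |
| 99 | Mistral 7B v0.3 | Mistral | 26% |
| 100 | Ministral 3 8B | Mistral | 26% |
| 101 | Mistral 8x7B v0.2 | Mistral | 25% |
| 102 | LFM2.5-1.2B-Thinking | LiquidAI | 24% |
| 103 | Ministral 3 3B (Reasoning) | Mistral | 23% |
| 104 | LFM2.5-1.2B-Instruct | LiquidAI | 20% |
| 105 | Ministral 3 3B | Mistral | 19% |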

The published HMMT Feb 2023 snapshot places GPT-5.4 first at 96%. The third-ranked model, GPT-5.1-Codex-Max, is 1.0 percentage points behind at 95%. The top-10 range is also 1.0 percentage points, so the leading results sit in a narrow band.
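The gap and range figures quoted here can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the ten highest tracked scores as displayed in the tracked score table (rounded to whole percentages, so the unrounded values may differ slightly), and the helper names `gap_to_leader` and `score_range` are hypothetical, not part of any BenchLM tooling.

```python
# Top ten tracked scores (percent) in rank order, as displayed on this page.
# These are rounded display values; the underlying scores may differ slightly.
top10 = [
    ("GPT-5.4", 96),
    ("GPT-5.2 Pro", 96),
    ("GPT-5.1-Codex-Max", 95),
    ("GPT-5.2-Codex", 95),
    ("GPT-5.3 Codex", 95),
    ("Grok 4.1", 95),
    ("Gemini 3 Pro Deep Think", 95),
    ("Claude Opus 4.6", 95),
    ("GPT-5.1", 95),
    ("GPT-5.2", 95),
]


def gap_to_leader(rows, rank):
    """Percentage-point gap between the first row and the row at `rank` (1-based)."""
    return rows[0][1] - rows[rank - 1][1]


def score_range(rows):
    """Difference between the highest and lowest score in `rows`."""
    scores = [score for _, score in rows]
    return max(scores) - min(scores)


third_gap = gap_to_leader(top10, 3)
top10_range = score_range(top10)

print(f"Gap from rank 1 to rank 3: {third_gap} percentage point(s)")
print(f"Top-10 range: {top10_range} percentage point(s)")
```

With the displayed values, both figures come out to one percentage point, which matches the summary sentence.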

105 models have been evaluated on HMMT Feb 2023. The benchmark falls in the Math category. This category carries a 5% weight in BenchLM.ai's overall scoring system. HMMT Feb 2023 is currently displayed for reference but excluded from the scoring formula, so it does not directly affect overall rankings.

About HMMT Feb 2023

Year: 2023

Tasks: Tournament problems

Format: Competition mathematics

Difficulty: High school olympiad level

HMMT is one of the most competitive high school mathematics tournaments in the US. Problems span algebra, geometry, combinatorics, and number theory, requiring deep mathematical insight.

Public benchmark source: Harvard-MIT Mathematics Tournament

BenchLM freshness & provenance

Version: HMMT Feb 2023

Refresh cadence: Static

Staleness state: Stale

Question availability: Public benchmark set

Stale · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

FAQ

What does HMMT Feb 2023 measure?

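HMMT Feb 2023 measures how well a model solves problems from the February 2023 Harvard-MIT Mathematics Tournament, a prestigious high school competition hosted jointly by Harvard and MIT. The problems span algebra, geometry, combinatorics, and number theory.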

Which model leads the published HMMT Feb 2023 snapshot?

GPT-5.4 currently leads the published HMMT Feb 2023 snapshot with a tracked score of 96%. BenchLM shows this benchmark for display only and does not use it in overall rankings.

How many models are evaluated on HMMT Feb 2023?

105 AI models are included in BenchLM's mirrored HMMT Feb 2023 snapshot, based on the public leaderboard captured on July 20, 2026.

Learn More

Read our explainer: HMMT Feb 2023 benchmark deep dive

Last updated: July 20, 2026 · mirrored from the public benchmark leaderboard
