MATH-500 Problem Set (MATH-500)

A curated subset of 500 problems from the MATH dataset, covering algebra, counting and probability, geometry, intermediate algebra, number theory, prealgebra, and precalculus.

Top models on MATH-500 — June 2, 2026

As of June 2, 2026, MiniCPM5-1B leads the MATH-500 leaderboard with 91.6%, followed by LFM2.5-8B-A1B (88.8%).

2 models · Math · 15% of category score · Stale · Updated June 2, 2026

About MATH-500

Year: 2021

Tasks: 500 problems

Format: Free-form mathematical answers

Difficulty: High school to undergraduate

MATH-500 is one of the most widely cited math benchmarks. It is nearing saturation, with top reasoning models scoring 96-99%, which makes it less useful for differentiating frontier models but still a standard baseline.

BenchLM freshness & provenance

Version: MATH-500 2021

Refresh cadence: Static

Staleness state: Stale

Question availability: Public benchmark set

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

Leaderboard (2 models)

1. MiniCPM5-1B: 91.6%
2. LFM2.5-8B-A1B: 88.8%
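
To make the leaderboard percentages concrete, the sketch below converts each score into a count of solved problems. It assumes a MATH-500 score is plain accuracy over the 500 problems, which this page does not state explicitly; the helper `problems_solved` is a hypothetical name used only for illustration.

```python
# Assumption: a MATH-500 score is simple accuracy, i.e. correct answers / 500.
TOTAL_PROBLEMS = 500


def problems_solved(score_percent: float, total: int = TOTAL_PROBLEMS) -> int:
    """Convert an accuracy percentage into a count of correct answers."""
    # round() absorbs floating-point noise such as 457.99999999999994.
    return round(score_percent / 100 * total)


# Scores as reported on the leaderboard for June 2, 2026.
scores = {"MiniCPM5-1B": 91.6, "LFM2.5-8B-A1B": 88.8}

solved = {model: problems_solved(pct) for model, pct in scores.items()}
gap = solved["MiniCPM5-1B"] - solved["LFM2.5-8B-A1B"]

for model, count in solved.items():
    print(f"{model}: {count} of {TOTAL_PROBLEMS} problems")
print(f"Gap between the two models: {gap} problems")
```

Under that assumption, 91.6% corresponds to 458 of 500 problems and 88.8% to 444, so the two models are separated by 14 problems.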

FAQ

What does MATH-500 measure?

MATH-500 measures mathematical problem solving on a curated subset of 500 problems from the MATH dataset, covering algebra, counting and probability, geometry, intermediate algebra, number theory, prealgebra, and precalculus.

Which model scores highest on MATH-500?

MiniCPM5-1B by OpenBMB currently leads with a score of 91.6% on MATH-500.

How many models are evaluated on MATH-500?

Two AI models have been evaluated on MATH-500 on BenchLM.

Last updated: June 2, 2026 · BenchLM version MATH-500 2021
