A curated subset of 500 problems from the MATH dataset, covering algebra, counting and probability, geometry, intermediate algebra, number theory, prealgebra, and precalculus.
As of June 2, 2026, MiniCPM5-1B leads the MATH-500 leaderboard with 91.6%, followed by LFM2.5-8B-A1B (88.8%).
MiniCPM5-1B (OpenBMB): 91.6%
LFM2.5-8B-A1B (LiquidAI): 88.8%
Year: 2021
Tasks: 500 problems
Format: Free-form mathematical answers
Difficulty: High school to undergraduate
MATH-500 is one of the most widely cited math benchmarks. It is nearing saturation, with top reasoning models scoring 96-99%, which makes it less useful for differentiating frontier models, though it remains a standard baseline.
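Because answers are free-form, a MATH-500 score comes down to the share of the 500 problems whose final answer matches the reference. The sketch below is one minimal way to compute that, assuming simple string normalization followed by exact match; the `normalize` and `accuracy` helpers are hypothetical names introduced here for illustration, and this is not BenchLM's grader. Real graders usually also check mathematical equivalence (for example, `0.5` versus `1/2`), which this sketch does not attempt.

```python
import re


def normalize(answer: str) -> str:
    """Crude normalization of a free-form math answer (illustrative only)."""
    text = answer.strip()
    # Unwrap \boxed{...} when the whole answer is a single boxed expression.
    match = re.fullmatch(r"\\boxed\{(.*)\}", text)
    if match:
        text = match.group(1)
    # Drop math-mode dollar signs and all whitespace.
    text = text.replace("$", "")
    return re.sub(r"\s+", "", text)


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Percentage of predictions that exactly match after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


# Toy data, not taken from the benchmark: three of four answers match.
predictions = ["\\boxed{3/4}", "$x + 1$", "12", "7"]
references = ["3/4", "x+1", "12", "8"]
score = accuracy(predictions, references)
```

On a 500-problem set each problem is worth 0.2 percentage points, so a reported 91.6% corresponds to 458 correct answers and 88.8% to 444.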
Version: MATH-500 2021
Refresh cadence: Static
Staleness state: Stale
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
MiniCPM5-1B by OpenBMB currently leads with a score of 91.6% on MATH-500.
Two AI models have been evaluated on MATH-500 on BenchLM.