A continuously updated benchmark using fresh competitive programming problems from LeetCode, Codeforces, and AtCoder to provide contamination-free code generation evaluation.
As of June 2, 2026, DeepSeek V4 Pro (Max) leads the LiveCodeBench leaderboard with 93.5%, followed by Qwen3.7 Max (91.6%) and DeepSeek V4 Flash (Max) (91.6%).
DeepSeek V4 Pro (Max), by DeepSeek
Qwen3.7 Max, by Alibaba
DeepSeek V4 Flash (Max), by DeepSeek
According to BenchLM.ai, the top three models on LiveCodeBench sit within 1.9 points of each other (93.5% for the leader against 91.6% for second and third place), which suggests the benchmark is nearing saturation for frontier models.
Fourteen models have been evaluated on LiveCodeBench, which falls in the Coding category. That category carries a 20% weight in BenchLM.ai's overall scoring system, and LiveCodeBench contributes 23% of the category score, so strong performance here directly affects a model's overall ranking.
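To make the weighting concrete, here is a minimal sketch of how the two percentages might combine. It assumes the category weight and the benchmark's share of that category simply multiply, which this page does not confirm; the function names `effective_weight` and `overall_points` are hypothetical and are not part of any BenchLM.ai API.

```python
# Weights quoted on this page.
CODING_CATEGORY_WEIGHT = 0.20  # Coding category's share of the overall score
LIVECODEBENCH_SHARE = 0.23     # LiveCodeBench's share of the Coding category


def effective_weight(category_weight=CODING_CATEGORY_WEIGHT,
                     benchmark_share=LIVECODEBENCH_SHARE):
    """Share of the overall score attributable to one benchmark.

    ASSUMPTION: the two weights compose multiplicatively. The real
    scoring policy may normalize or adjust weights differently.
    """
    return category_weight * benchmark_share


def overall_points(score_pct, weight=None):
    """Overall-score points a benchmark score contributes under that assumption."""
    if weight is None:
        weight = effective_weight()
    return score_pct * weight


if __name__ == "__main__":
    print(f"Effective weight: {effective_weight():.1%}")
    for name, score in [("DeepSeek V4 Pro (Max)", 93.5), ("Qwen3.7 Max", 91.6)]:
        print(f"{name}: {overall_points(score):.2f} overall points")
```

Under this assumption LiveCodeBench accounts for roughly 4.6% of the overall score, so the 1.9-point gap between first and second place would move the overall score by less than a tenth of a point.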
Year: 2024
Tasks: Continuously updated
Format: Competitive programming
Difficulty: Competitive programming level
LiveCodeBench addresses data contamination concerns by continuously sourcing new problems from competitive programming platforms. It evaluates code generation, self-repair, code execution, and test output prediction.
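One common way to use freshly sourced problems against contamination is to score a model only on problems published after its training cutoff. The sketch below illustrates that idea under stated assumptions: the `Problem` record, the `fresh_problems` helper, and the sample data are all hypothetical and do not reflect LiveCodeBench's actual data format or tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    """Hypothetical record for one benchmark problem."""
    title: str
    platform: str   # e.g. "LeetCode", "Codeforces", "AtCoder"
    released: date  # date the problem was first published


def fresh_problems(problems, training_cutoff):
    """Keep only problems released strictly after the model's training cutoff.

    ASSUMPTION: a problem published after the cutoff cannot have appeared
    in the model's training data.
    """
    return [p for p in problems if p.released > training_cutoff]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    pool = [
        Problem("two-pointer warmup", "LeetCode", date(2025, 1, 10)),
        Problem("tree rerooting", "Codeforces", date(2025, 11, 2)),
        Problem("grid dp", "AtCoder", date(2026, 2, 14)),
    ]
    for p in fresh_problems(pool, training_cutoff=date(2025, 6, 30)):
        print(p.released, p.platform, p.title)
```

Because each model can have a different cutoff, the evaluated subset can differ per model, which is worth keeping in mind when comparing scores across models.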
Version: Rolling 2026 set
Refresh cadence: Rolling
Staleness state: Current
Question availability: Delayed public release
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.