SciCode evaluates language models on generating code for realistic scientific research problems across 16 subfields of physics, math, chemistry, biology, and materials science. Its 80 main problems decompose into 338 subproblems that require domain knowledge recall, scientific reasoning, and precise code synthesis. The problems are based on real scripts from published research.
As of May 22, 2026, Qwen3.7 Max leads the SciCode leaderboard with 53.5%, followed by Gemini 3.5 Flash (53.1%) and Kimi K2.6 (52.2%).
Qwen3.7 Max (Alibaba)
Gemini 3.5 Flash
Kimi K2.6 (Moonshot AI)
According to BenchLM.ai, the top three models are clustered within 1.3 percentage points of each other (53.5% down to 52.2%). BenchLM.ai reads this tight spread as a sign that the benchmark is nearing saturation for frontier models, although the leading score is still well below 100%.
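The spread quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the scores are the three reported in the text, and the helper name `top_spread` is made up for this example and is not part of any BenchLM.ai tooling.

```python
# Scores as reported in the text above (percent solved on SciCode).
scores = {
    "Qwen3.7 Max": 53.5,
    "Gemini 3.5 Flash": 53.1,
    "Kimi K2.6": 52.2,
}


def top_spread(scores: dict[str, float]) -> float:
    """Gap between the best and worst score, in percentage points.

    Rounded to one decimal place to hide floating-point noise.
    """
    values = list(scores.values())
    return round(max(values) - min(values), 1)


leader = max(scores, key=scores.get)  # model with the highest score
spread = top_spread(scores)
print(f"{leader} leads; top-three spread is {spread} points")
```

A small spread on its own says the benchmark separates these three models poorly. It does not say how much headroom remains, which depends on the absolute scores.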
Nine models have been evaluated on SciCode. The benchmark falls in the Coding category, which carries a 20% weight in BenchLM.ai's overall scoring system. Within that category, SciCode contributes 10% of the category score, so strong performance here directly affects a model's overall ranking.
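To make the weighting concrete, the sketch below combines the two percentages quoted above. It assumes the category weight and the within-category share simply multiply. That is a guess at how the scoring composes, not BenchLM.ai's documented formula, and the function names are hypothetical.

```python
CODING_CATEGORY_WEIGHT = 0.20      # Coding category share of the overall score (from the text)
SCICODE_SHARE_OF_CATEGORY = 0.10   # SciCode share of the Coding category (from the text)


def effective_weight(category_weight: float, benchmark_share: float) -> float:
    """Share of the overall score attributable to one benchmark.

    Assumption: weights compose multiplicatively (linear weighted average).
    """
    return category_weight * benchmark_share


def overall_delta(score_gain_points: float, weight: float) -> float:
    """Overall-score change from a gain on one benchmark, under the same assumption."""
    return score_gain_points * weight


weight = effective_weight(CODING_CATEGORY_WEIGHT, SCICODE_SHARE_OF_CATEGORY)
print(f"SciCode effective weight: {weight:.1%}")
print(f"A 10-point SciCode gain moves the overall score by about {overall_delta(10, weight):.2f} points")
```

Under this assumption SciCode accounts for about 2% of the overall score, so a 10-point gain on SciCode would shift the overall score by roughly 0.2 points.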
Year: 2024
Tasks: 80
Version: SciCode 2024
Refresh cadence: Annual
Staleness state: Refreshing
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
Qwen3.7 Max by Alibaba currently leads with a score of 53.5% on SciCode.
Nine AI models have been evaluated on SciCode at BenchLM.