NL2Repo

A repository-understanding benchmark that measures whether models can map natural-language requests onto the right code locations and system changes.

Top Models on NL2Repo — March 2026

As of March 2026, MiniMax M2.7 leads the NL2Repo leaderboard with 39.8%; it is the only model evaluated so far.

1 model · Coding · Updated March 18, 2026

About NL2Repo

Year: 2026

Tasks: Natural language to repository tasks

Format: Repository understanding benchmark

Difficulty: System-level software comprehension

MiniMax cites NL2Repo as a system-level engineering benchmark that rewards deep understanding of complex repositories and their operational structure.

MiniMax M2.7: Early Echoes of Self-Evolution

Leaderboard (1 model)

#1 MiniMax M2.7: 39.8%

FAQ

What does NL2Repo measure?

NL2Repo measures whether models can map natural-language requests onto the right code locations and system changes within a repository.

Which model scores highest on NL2Repo?

MiniMax M2.7 by MiniMax currently leads with a score of 39.8% on NL2Repo.

How many models are evaluated on NL2Repo?

One AI model has been evaluated on NL2Repo on BenchLM.

Last updated: March 18, 2026
