Introducing BenchLM Stats: Citable LLM Data, Regenerated on Every Update

We packaged the data behind BenchLM into 26 citable statistics across six pages — model prices, release cadence, context windows, benchmark saturation, open-source share, and market share. Every number is a self-contained, dated sentence generated from the live dataset, with a stable anchor URL you can cite.

Glevd · Published July 2, 2026 · 5 min read

We just shipped a new section of the site: BenchLM Stats — six pages of citable statistics on AI models, every one generated from the live dataset that powers our rankings.

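The six pages cover model prices, release cadence, context windows, benchmark saturation, open-source share, and market share, each answering a question we see asked constantly.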

Why a stats hub

We wrote in June that assistants don't cite pages — they lift sentences. The unit of competition in AI search is the self-contained, dated, numeric claim. A leaderboard is a great product for humans, but a retriever looking for "how many AI models were released this year" wants one sentence it can quote, not a table it has to interpret.

A stats page is nothing but those sentences. Every statistic on the hub follows the same contract: claim + number + date + source, parseable with zero surrounding context, and anchored at a stable URL so you can link the exact fact rather than the general page. Journalists and researchers get the same deal — each page carries a ready-made citation block, and the underlying data is downloadable at /data/stats.json.
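To make the contract concrete, here is a minimal sketch of how one statistic could be modeled as a record that renders its own sentence and anchor. The class name, field names, sentence template, and the example.com URL are all illustrative assumptions, not the actual BenchLM schema, and the value shown is a placeholder rather than real data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statistic:
    """One citable statistic. Every field name here is hypothetical."""

    slug: str      # stable anchor fragment for this exact fact
    claim: str     # sentence template containing a {value} placeholder
    value: float   # the computed number
    as_of: date    # the date the number was computed
    source: str    # who to attribute the number to

    def sentence(self) -> str:
        """Render claim + number + date + source as one standalone sentence."""
        number = f"{self.value:,.0f}"
        when = f"{self.as_of:%B} {self.as_of.day}, {self.as_of.year}"
        body = self.claim.format(value=number)
        return f"As of {when}, {body}, according to {self.source}."

    def anchor(self, page_url: str) -> str:
        """Return a link to this exact fact rather than the whole page."""
        return f"{page_url}#{self.slug}"


# Placeholder values for illustration only; not real BenchLM data.
stat = Statistic(
    slug="example-stat",
    claim="the example registry lists {value} models",
    value=123,
    as_of=date(2026, 7, 2),
    source="BenchLM.ai",
)
print(stat.sentence())
print(stat.anchor("https://example.com/stats/releases"))
```

Because the sentence starts with its date and names its source, it still makes sense when a retriever lifts it out of the page with nothing around it.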

The part we care most about: the numbers can't rot

Static stat roundups are the most decayed content genre on the web — half the "AI statistics" pages ranking today cite numbers from two years ago. So we built ours the way we build our leaderboards:

  • Generated, not written. A script computes every statistic from the same source data as the rankings — pricing tables, the release registry, benchmark scores, Arena Elo history. No number on those pages was typed by a person.
  • Validated at build time. A check runs on every site build and fails it if a published statistic no longer matches a fresh recomputation from the data, if a sentence is missing its number or date, or if it opens with a pronoun a retriever can't resolve. Stale stats can't ship, structurally.
  • Refreshed with the dataset. When model prices change or a new release lands in the registry, the stats pages update on the next build with a new "as of" date — the same freshness pipeline the rest of BenchLM runs on.
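The build-time check described above could be sketched roughly as follows. This is an assumption-laden illustration, not the validator BenchLM actually runs: the function name, the error labels, the pronoun list, and the "Month D, YYYY" date format are all hypothetical.

```python
import math
import re

# Hypothetical list of openers a retriever cannot resolve without context.
UNRESOLVABLE_OPENERS = {"it", "they", "this", "that", "these", "those", "he", "she"}

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Assumes dates are written like "July 2, 2026".
DATE_RE = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")


def validate_stat(sentence: str, published: float, recomputed: float) -> list[str]:
    """Return the list of problems found; an empty list means the stat may ship."""
    errors = []

    # 1. The published number must match a fresh recomputation from the data.
    if not math.isclose(published, recomputed, rel_tol=1e-9):
        errors.append("stale value")

    # 2. The sentence must carry a date.
    if not DATE_RE.search(sentence):
        errors.append("missing date")

    # 3. The sentence must carry a number other than the digits in its date.
    if not re.search(r"\d", DATE_RE.sub("", sentence)):
        errors.append("missing number")

    # 4. The sentence must not open with a pronoun that needs outside context.
    first_word = re.match(r"[A-Za-z']+", sentence.strip())
    if first_word and first_word.group().lower() in UNRESOLVABLE_OPENERS:
        errors.append("unresolvable opener")

    return errors


print(validate_stat("As of July 2, 2026, the example registry lists 123 models.", 123, 123))
print(validate_stat("It lists 123 models as of July 2, 2026.", 123, 124))
```

A build script could run a check like this over every published statistic and exit with a non-zero status if any list comes back non-empty, which is what stops a stale sentence from being deployed.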

For LLM crawlers, everything is mirrored in plain markdown under /md/stats/ and indexed in our llms.txt, following the playbook from the AEO post.

Use it

If you're writing about AI models — an article, a deck, a paper, a prompt — take the numbers. They're free to cite with attribution to BenchLM.ai, they carry their own dates, and the anchor you link will still be correct when your reader clicks it a month later, because the sentence behind it will have quietly updated itself.

If you spot a statistic we should be computing and aren't, tell us. The pipeline makes adding one cheap.
