BenchLM is tracking Grok 4.20 Beta by xAI. Benchmark coverage is coming soon.
BenchLM is tracking Grok 4.20 Beta, but sourced benchmark results are not published on the site yet. This page currently shows the model metadata we can verify now, and score-level benchmark coverage will appear once public evaluations land.
Grok 4.20 Beta is a proprietary model with a 2M token context window. It uses explicit chain-of-thought reasoning, which typically improves performance on math and complex reasoning tasks at the cost of higher latency and token usage.
Grok 4.20 Beta sits inside the Grok 4.20 family alongside Grok 4.20 Multi-agent Beta. BenchLM links it directly to Grok 4.1 as the earlier related model in that lineage. This profile currently has 0 sourced benchmarks on BenchLM, so the benchmark sections below are intentionally marked as coming soon.
Creator
xAI
Source Type
Proprietary
Reasoning
Reasoning
Context Window
2M
Overall Score
Coming soon
Category rankings are coming soon. BenchLM will populate this section once sourced benchmark results are available for this model.
BenchLM is tracking Grok 4.20 Beta, but sourced benchmark coverage is still coming soon. We currently list its creator, model type, and context window while we wait for public benchmark results.
Grok 4.20 Beta belongs to the Grok 4.20 family. Related variants on BenchLM include Grok 4.20 Multi-agent Beta.
Grok 4.20 Beta does not have sourced benchmark results on BenchLM yet. It currently has 0 sourced benchmark scores out of the 32 benchmarks BenchLM tracks, so its overall score is withheld until results are added.
Grok 4.20 Beta has a context window of 2M tokens, which determines how much text it can process in a single interaction.
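To give a feel for what a 2M context window means in practice, here is a minimal sketch that estimates whether a piece of text would fit. It rests on two assumptions that do not come from this page: that "2M" means exactly 2,000,000 tokens, and that English text averages roughly 4 characters per token. The function names are hypothetical, and the model's real tokenizer will give different counts.

```python
import math

# Assumption: "2M" on this page is read as exactly 2,000,000 tokens.
CONTEXT_WINDOW_TOKENS = 2_000_000

# Assumption: rough rule of thumb for English text, not the model's tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str) -> bool:
    """Return True if the estimated token count is within the window."""
    return estimate_tokens(text) <= CONTEXT_WINDOW_TOKENS
```

Under these assumptions, about 8 million characters of English text would sit at the edge of the window. For anything close to the limit, count tokens with the provider's own tooling rather than relying on a heuristic like this.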