Sarvam 105B Benchmark Scores & Performance

Benchmark analysis of Sarvam 105B by Sarvam across 12 sourced tests on BenchLM.

According to BenchLM.ai, Sarvam 105B ranks #53 out of 104 models with an overall score of 60/100. It is not a frontier model, but its Mathematics rank (#20) is well above its overall position.

Sarvam 105B is an open weight model with a 128K token context window. It uses explicit chain-of-thought reasoning, which typically improves performance on math and complex reasoning tasks at the cost of higher latency and token usage.

This profile currently has 12 of 126 tracked benchmarks. BenchLM only exposes verified benchmark rows publicly, so missing categories stay blank until a sourced evaluation is available.
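To put the coverage figure in perspective, the arithmetic can be sketched as below. This is only an illustration using the two counts quoted above; the function name `coverage_percent` is hypothetical and not part of any BenchLM tooling.

```python
# Counts quoted in the profile text above.
scored_benchmarks = 12
tracked_benchmarks = 126


def coverage_percent(scored: int, tracked: int) -> float:
    """Return the share of tracked benchmarks that have a score, in percent."""
    if tracked <= 0:
        raise ValueError("tracked must be positive")
    return 100.0 * scored / tracked


coverage = coverage_percent(scored_benchmarks, tracked_benchmarks)
print(f"Coverage: {coverage:.1f}%")  # prints "Coverage: 9.5%"
```

In other words, fewer than one in ten tracked benchmarks currently has a sourced score for this model, so the overall rank rests on a small sample.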

Its strongest category is Mathematics (#20), while its weakest is Instruction Following (#42). These are the only two categories in which BenchLM currently assigns it a global rank. On this evidence, the model is best supported for mathematical reasoning and quantitative analysis.

Provider: Sarvam

Source Type: Open Weight

Reasoning: Reasoning

Context Window: 128K

Model Status: Current

Release Date: Mar 6, 2026

Overall Score: 60 (#53 of 104)

Pricing: $0.00 / $0.00 (input / output per 1M)

Runtime: N/A (latency unavailable)

Knowledge Benchmarks

MMLU · Stale · Saturated · Display only
90.6%

MMLU · Static refresh · updated April 3, 2026

MMLU-Pro · Refreshing
81.7%

MMLU-Pro · Static refresh · updated April 3, 2026

Coding Benchmarks

LiveCodeBench v6 · Current · Display only
71.7%

LiveCodeBench v6 2026 · Quarterly refresh · updated April 3, 2026

SWE-bench Verified · Refreshing
45%

SWE-bench Verified 2024 · Annual refresh · updated April 3, 2026

Mathematics Benchmarks

MATH-500 · Stale
98.6%

MATH-500 2021 · Static refresh · updated April 3, 2026

AIME 2025 · Current
88.3%

AIME 2025 · Annual refresh · updated April 3, 2026

HMMT Feb 2025 · Current · Display only
85.8%

HMMT Feb 2025 · Quarterly refresh · updated April 3, 2026

HMMT Nov 2025 · Current · Display only
85.8%

HMMT Nov 2025 · Quarterly refresh · updated April 3, 2026

Reasoning Benchmarks

GPQA Diamond · Refreshing · Display only
78.7%

GPQA Diamond · Static refresh · updated April 3, 2026

HLE · Current · Display only
11.2%

Humanity's Last Exam · Static refresh · updated April 3, 2026

Agentic Benchmarks

BrowseComp · Current
49.5%

BrowseComp 2026 · Quarterly refresh · updated April 3, 2026

Instruction Following Benchmarks

IFEval · Stale
84.8%

IFEval 2023 · Static refresh · updated April 3, 2026
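For readers who want to work with these numbers, the sketch below transcribes the twelve scores listed above and computes an unweighted mean per category. The plain mean is an assumption made for illustration: this page does not describe how BenchLM aggregates category scores (for example, whether display-only rows are excluded or weighted), so these values need not match the category averages BenchLM reports. The names `scores` and `category_means` are hypothetical.

```python
from statistics import mean

# Scores (percent) transcribed from the benchmark listing above.
scores = {
    "Knowledge": {"MMLU": 90.6, "MMLU-Pro": 81.7},
    "Coding": {"LiveCodeBench v6": 71.7, "SWE-bench Verified": 45.0},
    "Mathematics": {
        "MATH-500": 98.6,
        "AIME 2025": 88.3,
        "HMMT Feb 2025": 85.8,
        "HMMT Nov 2025": 85.8,
    },
    "Reasoning": {"GPQA Diamond": 78.7, "HLE": 11.2},
    "Agentic": {"BrowseComp": 49.5},
    "Instruction Following": {"IFEval": 84.8},
}


def category_means(data: dict[str, dict[str, float]]) -> dict[str, float]:
    """Unweighted mean per category (an assumption, not BenchLM's method)."""
    return {category: mean(rows.values()) for category, rows in data.items()}


means = category_means(scores)
for category, value in means.items():
    print(f"{category}: {value:.1f}")
```

A single-benchmark category such as Instruction Following simply returns that benchmark's score, which is why thin coverage makes category comparisons fragile.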

Frequently Asked Questions

How does Sarvam 105B perform overall in AI benchmarks?

Sarvam 105B ranks #53 out of 104 models with an overall score of 60. It is created by Sarvam and features a 128K context window.

Is Sarvam 105B good for knowledge and understanding?

Sarvam 105B has visible benchmark coverage in knowledge and understanding, but BenchLM does not currently assign it a global category rank there.

Is Sarvam 105B good for coding and programming?

Sarvam 105B has visible benchmark coverage in coding and programming, but BenchLM does not currently assign it a global category rank there.

Is Sarvam 105B good for mathematics?

Sarvam 105B ranks #20 out of 104 models in mathematics benchmarks with an average score of 92.3. There are stronger options in this category.

Is Sarvam 105B good for reasoning and logic?

Sarvam 105B has visible benchmark coverage in reasoning and logic, but BenchLM does not currently assign it a global category rank there.

Is Sarvam 105B good for agentic tool use and computer tasks?

Sarvam 105B has visible benchmark coverage in agentic tool use and computer tasks, but BenchLM does not currently assign it a global category rank there.

Is Sarvam 105B good for instruction following?

Sarvam 105B ranks #42 out of 104 models in instruction following benchmarks with an average score of 84.8. There are stronger options in this category.

Is Sarvam 105B open source?

Yes, Sarvam 105B is an open weight model created by Sarvam, meaning it can be downloaded and run locally or fine-tuned for specific use cases.

Does Sarvam 105B have full benchmark coverage on BenchLM?

Not yet. Sarvam 105B currently has 12 verified benchmark scores out of the 126 benchmarks BenchLM tracks. BenchLM only exposes verified public benchmark rows, so missing categories stay blank until a sourced evaluation is available.

What is the context window size of Sarvam 105B?

Sarvam 105B has a context window of 128K tokens, which determines how much text it can process in a single interaction.

Last updated: April 3, 2026 · Runtime metrics stay blank until BenchLM has a sourced snapshot.
