Name: Llama 4 Scout
Rating: 22 (17 reviews)
Author: Meta

Question 1

How does Llama 4 Scout perform overall in AI benchmarks?

Accepted Answer

Llama 4 Scout currently ranks #108 out of 119 models on BenchLM's provisional leaderboard, with an estimated overall score of 22. It was created by Meta and has a 10M-token context window.

Question 2

Is Llama 4 Scout good for knowledge and understanding?

Accepted Answer

Llama 4 Scout ranks #94 out of 119 models in knowledge and understanding benchmarks with an average score of 15.5. There are stronger options in this category.

Question 3

Is Llama 4 Scout good for coding and programming?

Accepted Answer

Llama 4 Scout ranks #95 out of 119 models in coding and programming benchmarks with an average score of 2.6. There are stronger options in this category.

Question 4

Is Llama 4 Scout good for reasoning and logic?

Accepted Answer

Llama 4 Scout ranks #62 out of 119 models in reasoning and logic benchmarks with an average score of 40.9. There are stronger options in this category.

Question 5

Is Llama 4 Scout good for agentic tool use and computer tasks?

Accepted Answer

Llama 4 Scout ranks #89 out of 119 models on benchmarks for agentic tool use and computer tasks, with an average score of 17.4. There are stronger options in this category.

Question 6

Is Llama 4 Scout good for multimodal and grounded tasks?

Accepted Answer

Llama 4 Scout ranks #78 out of 119 models on benchmarks for multimodal and grounded tasks, with an average score of 35.9. There are stronger options in this category.

Question 7

Is Llama 4 Scout good for instruction following?

Accepted Answer

Llama 4 Scout ranks #107 out of 119 models in instruction following benchmarks with an average score of 18.8. There are stronger options in this category.
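
For a side-by-side view, the category figures quoted in the answers above can be collected and sorted. The sketch below is illustrative only: the dictionary and function names are hypothetical, and it assumes each category ranking covers the same 119 models.

```python
# Category ranks and average scores as quoted in the answers above.
# Each value is (rank, average score); names are illustrative, not BenchLM's.
CATEGORY_RESULTS = {
    "knowledge and understanding": (94, 15.5),
    "coding and programming": (95, 2.6),
    "reasoning and logic": (62, 40.9),
    "agentic tool use and computer tasks": (89, 17.4),
    "multimodal and grounded tasks": (78, 35.9),
    "instruction following": (107, 18.8),
}

# Assumption: every category ranks the same 119 models.
TOTAL_MODELS = 119


def share_ranked_below(rank: int, total: int = TOTAL_MODELS) -> float:
    """Return the percentage of ranked models placed below the given rank."""
    return 100.0 * (total - rank) / total


def categories_by_rank() -> list[str]:
    """Return category names ordered from best rank to worst rank."""
    return sorted(CATEGORY_RESULTS, key=lambda name: CATEGORY_RESULTS[name][0])


if __name__ == "__main__":
    for name in categories_by_rank():
        rank, score = CATEGORY_RESULTS[name]
        print(f"{name}: rank #{rank}, avg {score}, "
              f"{share_ranked_below(rank):.1f}% of models ranked below")
```

On these figures, reasoning and logic is the model's best-placed category and instruction following its worst.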

Question 8

Is Llama 4 Scout open source?

Accepted Answer

Llama 4 Scout is an open-weight model created by Meta: its weights can be downloaded and run locally or fine-tuned for specific use cases, subject to Meta's license terms.

Question 9

Does Llama 4 Scout have full benchmark coverage on BenchLM?

Accepted Answer

Not yet. Llama 4 Scout currently has 17 published benchmark scores out of the 225 benchmarks BenchLM tracks. BenchLM only exposes non-generated public benchmark rows, so missing categories stay blank until a sourced evaluation is available.
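
To put the coverage figure in proportion, the two counts above can be turned into a percentage. This is a minimal sketch using only the numbers quoted in this answer; the function name is hypothetical.

```python
def coverage_percent(published: int, tracked: int) -> float:
    """Return the share of tracked benchmarks that have a published score."""
    if tracked <= 0:
        raise ValueError("tracked must be a positive number of benchmarks")
    return 100.0 * published / tracked


# Figures quoted above: 17 published scores out of 225 tracked benchmarks.
coverage = coverage_percent(17, 225)

if __name__ == "__main__":
    print(f"Benchmark coverage: {coverage:.1f}%")
```

That works out to roughly 7.6% coverage, which is one reason the overall score is labeled as estimated.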

Question 10

What is the context window size of Llama 4 Scout?

Accepted Answer

Llama 4 Scout has a context window of 10M tokens, which determines how much text it can process in a single interaction.
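
As a rough illustration of what that limit means in practice, the sketch below estimates whether a piece of text would fit. It assumes about 4 characters per token, a common rule of thumb for English text and not a property of Llama 4 Scout's tokenizer; an exact count requires the model's own tokenizer. All names here are hypothetical.

```python
import math

# Context window quoted above, taken to mean 10 million tokens.
CONTEXT_WINDOW_TOKENS = 10_000_000

# Assumption: roughly 4 characters per token for English prose.
# Real token counts depend on the tokenizer and the text itself.
ASSUMED_CHARS_PER_TOKEN = 4.0


def estimated_tokens(text: str,
                     chars_per_token: float = ASSUMED_CHARS_PER_TOKEN) -> int:
    """Estimate the token count of text from its character length."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the estimated token count is within the window."""
    return estimated_tokens(text) <= window


if __name__ == "__main__":
    sample = "word " * 1000
    print(estimated_tokens(sample), fits_in_context(sample))
```

The estimate is only a first check; prompts, system messages, and generated output also count against the same window.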
