Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Both models are tied with an overall score of 58.
Knowledge: Gemini 3 Flash 67.8, Llama 3.1 405B 68.5
Coding: Gemini 3 Flash 62, Llama 3.1 405B 62
Math: Gemini 3 Flash 69, Llama 3.1 405B 69
Reasoning: Gemini 3 Flash 66, Llama 3.1 405B 67
Gemini 3 Flash and Llama 3.1 405B are tied with identical overall scores of 58.
Llama 3.1 405B narrowly leads in knowledge tasks, with an average score of 68.5 vs 67.8 for Gemini 3 Flash.
Gemini 3 Flash and Llama 3.1 405B are tied in coding, each with an average score of 62.
Gemini 3 Flash and Llama 3.1 405B are tied in math, each with an average score of 69.
Llama 3.1 405B narrowly leads in reasoning, with an average score of 67 vs 66 for Gemini 3 Flash.
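
The per-category statements above can be reproduced from the reported averages. The following is a minimal Python sketch, not the page's own method: the `scores` dictionary and the helper names `category_leader` and `margin` are hypothetical and introduced only for illustration. It does not recompute the overall score of 58, because the source does not say how that figure is derived.

```python
# Category averages exactly as reported in the comparison.
scores = {
    "knowledge": {"Gemini 3 Flash": 67.8, "Llama 3.1 405B": 68.5},
    "coding": {"Gemini 3 Flash": 62, "Llama 3.1 405B": 62},
    "math": {"Gemini 3 Flash": 69, "Llama 3.1 405B": 69},
    "reasoning": {"Gemini 3 Flash": 66, "Llama 3.1 405B": 67},
}


def category_leader(by_model):
    """Return the top-scoring model, or None when the top score is shared."""
    best = max(by_model.values())
    leaders = [name for name, value in by_model.items() if value == best]
    return leaders[0] if len(leaders) == 1 else None


def margin(by_model):
    """Return the gap between the highest and lowest score, to one decimal."""
    values = list(by_model.values())
    return round(max(values) - min(values), 1)


for category, by_model in scores.items():
    leader = category_leader(by_model)
    if leader is None:
        print(f"{category}: tied")
    else:
        print(f"{category}: {leader} leads by {margin(by_model)}")
```

Treating equal top scores as a tie, rather than picking the first model listed, keeps the output in line with how the summary describes coding and math.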