Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Gemini 3 Pro Deep Think wins overall with a score of 81 vs 39 for Llama 4 Behemoth (a 42-point difference). Gemini 3 Pro Deep Think wins 4 out of 4 categories.
Category averages (Gemini 3 Pro Deep Think vs Llama 4 Behemoth):
Knowledge: 96 vs 45.8
Coding: 91 vs 40
Math: 97.1 vs 47
Reasoning: 94 vs 45
Gemini 3 Pro Deep Think scores higher overall with 81 vs 39, a difference of 42 points across all benchmarks.
Gemini 3 Pro Deep Think leads in knowledge tasks with an average score of 96 vs 45.8.
Gemini 3 Pro Deep Think leads in coding with an average score of 91 vs 40.
Gemini 3 Pro Deep Think leads in math with an average score of 97.1 vs 47.
Gemini 3 Pro Deep Think leads in reasoning with an average score of 94 vs 45.
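The per-category margins and the "4 out of 4" win count can be checked with a few lines of Python. This is a minimal sketch: the `scores` dictionary and the helper names `category_margins` and `count_wins` are illustrative assumptions, not part of the comparison page. It does not attempt to reproduce the overall 81 vs 39 score, because the page does not say how the overall figure is derived from the category averages.

```python
# Category averages as reported in the comparison (values copied from the text).
scores = {
    "knowledge": {"Gemini 3 Pro Deep Think": 96.0, "Llama 4 Behemoth": 45.8},
    "coding": {"Gemini 3 Pro Deep Think": 91.0, "Llama 4 Behemoth": 40.0},
    "math": {"Gemini 3 Pro Deep Think": 97.1, "Llama 4 Behemoth": 47.0},
    "reasoning": {"Gemini 3 Pro Deep Think": 94.0, "Llama 4 Behemoth": 45.0},
}


def category_margins(scores, a, b):
    """Return, per category, model a's average minus model b's average."""
    # Round to one decimal place to hide floating-point noise.
    return {cat: round(vals[a] - vals[b], 1) for cat, vals in scores.items()}


def count_wins(margins):
    """Count the categories in which the first model has the higher average."""
    return sum(1 for margin in margins.values() if margin > 0)


A = "Gemini 3 Pro Deep Think"
B = "Llama 4 Behemoth"

margins = category_margins(scores, A, B)
wins = count_wins(margins)

for category, margin in margins.items():
    print(f"{category}: {A} leads by {margin} points")
print(f"{A} wins {wins} out of {len(margins)} categories")
```

Every category margin falls between 49 and 51 points, which is consistent with the clean sweep reported above.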