Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Qwen3.5 397B (Reasoning) wins overall with a score of 70 vs 48 (a 22-point difference). Qwen3.5 397B (Reasoning) wins 4 out of 4 categories.
Average score by category:
Knowledge: Llama 3 70B 56.5, Qwen3.5 397B (Reasoning) 88
Coding: Llama 3 70B 50, Qwen3.5 397B (Reasoning) 83
Math: Llama 3 70B 57, Qwen3.5 397B (Reasoning) 92
Reasoning: Llama 3 70B 55, Qwen3.5 397B (Reasoning) 86
Qwen3.5 397B (Reasoning) scores higher overall with 70 vs 48, a difference of 22 points across all benchmarks.
Qwen3.5 397B (Reasoning) leads in knowledge tasks with an average score of 88 vs 56.5.
Qwen3.5 397B (Reasoning) leads in coding with an average score of 83 vs 50.
Qwen3.5 397B (Reasoning) leads in math with an average score of 92 vs 57.
Qwen3.5 397B (Reasoning) leads in reasoning with an average score of 86 vs 55.
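The per-category gaps and the "4 out of 4" win count can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the helper names `category_margins` and `categories_won` are assumptions made for this example, and the scores are the category averages quoted above. It does not try to reproduce the overall 70 vs 48 figure, since the comparison does not say how the overall score is aggregated.

```python
# Category averages as quoted in the comparison above.
LLAMA = "Llama 3 70B"
QWEN = "Qwen3.5 397B (Reasoning)"

scores = {
    "knowledge": {LLAMA: 56.5, QWEN: 88},
    "coding": {LLAMA: 50, QWEN: 83},
    "math": {LLAMA: 57, QWEN: 92},
    "reasoning": {LLAMA: 55, QWEN: 86},
}


def category_margins(scores, baseline, challenger):
    """Return {category: challenger score minus baseline score}."""
    return {cat: vals[challenger] - vals[baseline] for cat, vals in scores.items()}


def categories_won(scores, model, other):
    """Count the categories in which `model` scores strictly higher than `other`."""
    return sum(1 for vals in scores.values() if vals[model] > vals[other])


margins = category_margins(scores, LLAMA, QWEN)
wins = categories_won(scores, QWEN, LLAMA)

for cat, gap in margins.items():
    print(f"{cat}: {QWEN} leads by {gap:g} points")
print(f"{QWEN} wins {wins} out of {len(scores)} categories")
```

On these figures the gap ranges from 31 points (reasoning) to 35 points (math).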