Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Both models are tied with an overall score of 70.
Knowledge: GPT-5 (medium) 88, Qwen3.5 397B (Reasoning) 88
Coding: GPT-5 (medium) 83, Qwen3.5 397B (Reasoning) 83
Math: GPT-5 (medium) 92, Qwen3.5 397B (Reasoning) 92
Reasoning: GPT-5 (medium) 86, Qwen3.5 397B (Reasoning) 86
GPT-5 (medium) and Qwen3.5 397B (Reasoning) are tied overall, each with a score of 70.
GPT-5 (medium) and Qwen3.5 397B (Reasoning) are tied on knowledge tasks, each with an average score of 88.
GPT-5 (medium) and Qwen3.5 397B (Reasoning) are tied on coding, each with an average score of 83.
GPT-5 (medium) and Qwen3.5 397B (Reasoning) are tied on math, each with an average score of 92.
GPT-5 (medium) and Qwen3.5 397B (Reasoning) are tied on reasoning, each with an average score of 86.