Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.
Moonshot v1 wins overall with a score of 44 vs 25 (a 19-point difference). Moonshot v1 wins 4 out of 4 categories.
Knowledge: DeepSeek V3.1 (Reasoning) 31.8, Moonshot v1 50.8
Coding: DeepSeek V3.1 (Reasoning) 26, Moonshot v1 45
Math: DeepSeek V3.1 (Reasoning) 33, Moonshot v1 52
Reasoning: DeepSeek V3.1 (Reasoning) 31, Moonshot v1 50
Moonshot v1 scores higher overall with 44 vs 25, a difference of 19 points across all benchmarks.
Moonshot v1 leads in knowledge tasks with an average score of 50.8 vs 31.8.
Moonshot v1 leads in coding with an average score of 45 vs 26.
Moonshot v1 leads in math with an average score of 52 vs 33.
Moonshot v1 leads in reasoning with an average score of 50 vs 31.
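The per-category gaps and the "4 out of 4" win count can be checked with a few lines of Python. This is a minimal sketch that uses only the category averages listed above; the names `scores`, `category_gaps`, and `count_wins` are illustrative and do not come from the source. It does not try to reproduce the overall scores (44 vs 25), because the page does not say how the categories are weighted or combined.

```python
# Category averages copied from the comparison above.
scores = {
    "knowledge": {"DeepSeek V3.1 (Reasoning)": 31.8, "Moonshot v1": 50.8},
    "coding": {"DeepSeek V3.1 (Reasoning)": 26, "Moonshot v1": 45},
    "math": {"DeepSeek V3.1 (Reasoning)": 33, "Moonshot v1": 52},
    "reasoning": {"DeepSeek V3.1 (Reasoning)": 31, "Moonshot v1": 50},
}


def category_gaps(scores, leader, other):
    """Return {category: leader score minus other score}, rounded to one decimal."""
    return {
        category: round(values[leader] - values[other], 1)
        for category, values in scores.items()
    }


def count_wins(scores, model):
    """Count categories where `model` has the highest score (a tie counts as a win)."""
    return sum(
        1 for values in scores.values() if values[model] == max(values.values())
    )


gaps = category_gaps(scores, "Moonshot v1", "DeepSeek V3.1 (Reasoning)")
for category, gap in gaps.items():
    print(f"{category}: Moonshot v1 leads by {gap} points")

print("Categories won by Moonshot v1:", count_wins(scores, "Moonshot v1"))
```

With the figures listed here, the gap works out to 19 points in every category, which matches the 19-point overall difference reported above.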