GPT-5.1-Codex-Max vs MiMo-V2-Flash

Side-by-side benchmark comparison across knowledge, coding, math, and reasoning.

Quick Verdict

GPT-5.1-Codex-Max wins overall with a score of 77 vs 63 (a 14-point difference). It also wins all 4 categories.

Knowledge

Category winner: GPT-5.1-Codex-Max

Benchmark        GPT-5.1-Codex-Max    MiMo-V2-Flash
Average          95                   76.8
MMLU             98                   79
GPQA             96                   78
SuperGPQA        94                   76
OpenBookQA       92                   74

Coding

Category winner: GPT-5.1-Codex-Max

Benchmark        GPT-5.1-Codex-Max    MiMo-V2-Flash
Average          94                   71
HumanEval        94                   71

Mathematics

Category winner: GPT-5.1-Codex-Max

Benchmark        GPT-5.1-Codex-Max    MiMo-V2-Flash
Average          97.1                 78
AIME 2023        99                   79
AIME 2024        99                   81
AIME 2025        98                   80
HMMT Feb 2023    95                   75
HMMT Feb 2024    97                   77
HMMT Feb 2025    96                   76
BRUMO 2025       96                   78

Reasoning

Category winner: GPT-5.1-Codex-Max

Benchmark        GPT-5.1-Codex-Max    MiMo-V2-Flash
Average          93                   75
SimpleQA         94                   76
MuSR             92                   74
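
The category averages listed above appear to be plain arithmetic means of the per-benchmark scores, rounded to one decimal place. The following is a minimal Python sketch of that calculation, not the site's actual method: the `SCORES` layout and the `category_averages` helper are hypothetical names introduced here for illustration. The overall scores (77 vs 63) are not reproduced, because the page does not state how they are derived.

```python
# Per-benchmark scores copied from the category listings above.
# Each pair is (GPT-5.1-Codex-Max, MiMo-V2-Flash).
SCORES = {
    "Knowledge": {
        "MMLU": (98, 79),
        "GPQA": (96, 78),
        "SuperGPQA": (94, 76),
        "OpenBookQA": (92, 74),
    },
    "Coding": {
        "HumanEval": (94, 71),
    },
    "Mathematics": {
        "AIME 2023": (99, 79),
        "AIME 2024": (99, 81),
        "AIME 2025": (98, 80),
        "HMMT Feb 2023": (95, 75),
        "HMMT Feb 2024": (97, 77),
        "HMMT Feb 2025": (96, 76),
        "BRUMO 2025": (96, 78),
    },
    "Reasoning": {
        "SimpleQA": (94, 76),
        "MuSR": (92, 74),
    },
}


def category_averages(scores):
    """Return {category: (gpt_avg, mimo_avg)}, each rounded to one decimal.

    Assumption: an unweighted mean over the benchmarks in each category.
    """
    result = {}
    for category, benchmarks in scores.items():
        pairs = list(benchmarks.values())
        gpt_avg = sum(gpt for gpt, _ in pairs) / len(pairs)
        mimo_avg = sum(mimo for _, mimo in pairs) / len(pairs)
        result[category] = (round(gpt_avg, 1), round(mimo_avg, 1))
    return result


averages = category_averages(SCORES)

# Count the categories in which the first model has the higher average.
categories_won = sum(1 for gpt, mimo in averages.values() if gpt > mimo)

for category, (gpt, mimo) in averages.items():
    print(f"{category}: {gpt} vs {mimo}")
print(f"Categories won by GPT-5.1-Codex-Max: {categories_won} of {len(averages)}")
```

Under the unweighted-mean assumption, the computed values agree with the averages shown for each category, which suggests no per-benchmark weighting is applied within a category.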

Frequently Asked Questions

Which is better, GPT-5.1-Codex-Max or MiMo-V2-Flash?

GPT-5.1-Codex-Max scores higher overall with 77 vs 63, a difference of 14 points across all benchmarks.

Which is better for knowledge tasks, GPT-5.1-Codex-Max or MiMo-V2-Flash?

GPT-5.1-Codex-Max leads in knowledge tasks with an average score of 95 vs 76.8.

Which is better for coding, GPT-5.1-Codex-Max or MiMo-V2-Flash?

GPT-5.1-Codex-Max leads in coding with an average score of 94 vs 71.

Which is better for math, GPT-5.1-Codex-Max or MiMo-V2-Flash?

GPT-5.1-Codex-Max leads in math with an average score of 97.1 vs 78.

Which is better for reasoning, GPT-5.1-Codex-Max or MiMo-V2-Flash?

GPT-5.1-Codex-Max leads in reasoning with an average score of 93 vs 75.