A multimodal search benchmark for retrieval and grounded answering across mixed-media inputs.
As of March 2026, GLM-5V-Turbo leads the MMSearch leaderboard with 72.9%, followed by Claude Opus 4.6 (63.8%) and Kimi K2.5 (58.7%).
GLM-5V-Turbo (Zhipu AI)
Claude Opus 4.6 (Anthropic)
Kimi K2.5 (Moonshot AI)
According to BenchLM.ai, GLM-5V-Turbo leads the MMSearch benchmark with a score of 72.9%, followed by Claude Opus 4.6 (63.8%) and Kimi K2.5 (58.7%). The spread is moderate: 9.1 points separate first place from second, 5.1 points separate second from third, and 14.2 points separate first from third.
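The gaps between the three reported scores can be checked with a few lines of arithmetic. The sketch below uses only the scores quoted on this page; the variable names are illustrative and are not part of any BenchLM.ai tooling.

```python
# Scores as reported on the MMSearch leaderboard (percent).
scores = {
    "GLM-5V-Turbo": 72.9,
    "Claude Opus 4.6": 63.8,
    "Kimi K2.5": 58.7,
}

# Rank models from highest to lowest score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Gap between each model and the one ranked directly below it,
# rounded to one decimal place to match the reported precision.
gaps = {
    f"{upper} over {lower}": round(upper_score - lower_score, 1)
    for (upper, upper_score), (lower, lower_score) in zip(ranked, ranked[1:])
}

# Total spread between the highest and lowest evaluated model.
spread = round(ranked[0][1] - ranked[-1][1], 1)

for label, gap in gaps.items():
    print(f"{label}: {gap} points")
print(f"Total spread: {spread} points")
```

With only three evaluated models, these gaps describe the current leaderboard rather than a stable ranking; adding a model could change the picture considerably.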
Three models have been evaluated on MMSearch. The benchmark belongs to the Multimodal & Grounded category, which carries a 12% weight in BenchLM.ai's overall scoring system. MMSearch itself is currently displayed for reference only and is excluded from the scoring formula, so it does not directly affect overall rankings.
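To illustrate what "displayed but excluded from the scoring formula" can mean in practice, here is a minimal sketch of a category-weighted score that skips display-only benchmarks. It is not BenchLM.ai's actual formula, which this page does not give. Only the 12% weight for Multimodal & Grounded and the 72.9% MMSearch score come from the text; the `Benchmark` class, the `overall_score` function, the second category, the other benchmark names and scores, and the choice to average within a category and renormalize weights are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Benchmark:
    name: str
    category: str
    score: float            # percent, 0-100
    display_only: bool = False


def overall_score(benchmarks, category_weights):
    """Weighted mean of per-category averages, ignoring display-only benchmarks.

    Assumption: weights are renormalized over the categories that still
    have at least one scored benchmark.
    """
    per_category = {}
    for b in benchmarks:
        if b.display_only:
            continue  # shown on the page, but contributes nothing to the total
        per_category.setdefault(b.category, []).append(b.score)

    total_weight = sum(category_weights[c] for c in per_category)
    if total_weight == 0:
        return None
    weighted = sum(
        category_weights[c] * (sum(vals) / len(vals))
        for c, vals in per_category.items()
    )
    return weighted / total_weight


# 0.12 is from the page; the 0.88 remainder is lumped into one hypothetical category.
weights = {"Multimodal & Grounded": 0.12, "Everything else (hypothetical)": 0.88}

# Hypothetical scored benchmarks, invented for the example.
scored = [
    Benchmark("Hypothetical benchmark A", "Multimodal & Grounded", 60.0),
    Benchmark("Hypothetical benchmark B", "Everything else (hypothetical)", 80.0),
]
mmsearch = Benchmark("MMSearch", "Multimodal & Grounded", 72.9, display_only=True)

without_mmsearch = overall_score(scored, weights)
with_mmsearch = overall_score(scored + [mmsearch], weights)
print(without_mmsearch == with_mmsearch)  # the display-only entry changes nothing
```

Under these assumptions, a model's MMSearch result could rise or fall without moving its overall score, which matches the statement that the benchmark does not directly affect overall rankings.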
Year: 2026
Tasks: Multimodal search tasks
Format: Mixed-media retrieval and grounded answering
Difficulty: Multimodal search
BenchLM stores MMSearch as a display-only benchmark because it is not yet part of the weighted core schema.
Version: MMSearch 2026
Refresh cadence: Quarterly
Staleness state: Current
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.