MMSearch-Plus

A harder MMSearch variant for multimodal retrieval and grounded tool-use workflows.

Benchmark score on MMSearch-Plus — July 1, 2026

BenchLM mirrors the published score view for MMSearch-Plus. Qwen3.7 Plus leads the public snapshot at 41.4%. BenchLM does not use these results to rank models overall.

1 model · Multimodal & Grounded · Current · Display only · Updated July 1, 2026

About MMSearch-Plus

Year: 2026
Tasks: Hard multimodal search tasks
Format: Advanced mixed-media retrieval benchmark
Difficulty: Advanced multimodal search

BenchLM tracks MMSearch-Plus as a display-only extension of multimodal search capability.

BenchLM freshness & provenance

Version: MMSearch-Plus 2026
Refresh cadence: Quarterly
Staleness state: Current
Question availability: Public benchmark set

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
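As a rough illustration of how freshness metadata could map to one of those three treatments, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Freshness` fields, the `classify` function, and all thresholds are hypothetical and are not taken from BenchLM's methodology, which remains the authoritative source.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Freshness:
    """Hypothetical freshness record; field names are illustrative only."""
    last_updated: date
    refresh_days: int        # e.g. 91 for a quarterly cadence
    models_evaluated: int
    questions_public: bool


def classify(meta: Freshness, today: date) -> str:
    """Return an illustrative tier. The rules below are assumed, not BenchLM's."""
    # Assumption: a benchmark is stale once it is older than its refresh cadence.
    stale = (today - meta.last_updated).days > meta.refresh_days
    # Assumption: stale data or a single evaluated model cannot differentiate models.
    if stale or meta.models_evaluated < 2:
        return "display-only"
    # Assumption: a fully public question set is only watched, not weighted.
    if meta.questions_public:
        return "watch"
    return "strong differentiator"


# Values taken from this page: updated July 1, 2026, quarterly, 1 model, public set.
mmsearch_plus = Freshness(date(2026, 7, 1), 91, 1, True)
print(classify(mmsearch_plus, date(2026, 7, 1)))  # prints "display-only"
```

Under these assumed rules, the MMSearch-Plus metadata shown on this page lands in the display-only tier because only one model has been evaluated. The real policy may weigh other signals entirely.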

Benchmark score table (1 model)

Rank | Model | Score
1 | Qwen3.7 Plus | 41.4%

FAQ

What does MMSearch-Plus measure?

A harder MMSearch variant for multimodal retrieval and grounded tool-use workflows.

Which model scores highest on MMSearch-Plus?

Qwen3.7 Plus by Alibaba currently leads with a score of 41.4% on MMSearch-Plus.

How many models are evaluated on MMSearch-Plus?

1 AI model has been evaluated on MMSearch-Plus on BenchLM.

Last updated: July 1, 2026 · BenchLM version MMSearch-Plus 2026
