A grounded multimodal factuality benchmark for evidence-linked answer correctness.
As of March 2026, GLM-5V-Turbo leads the Facts-VLM leaderboard with 58.6%, followed by Kimi K2.5 (57.8%).
GLM-5V-Turbo (Zhipu AI)
Kimi K2.5 (Moonshot AI)
Year: 2026
Tasks: Grounded factuality tasks
Format: Evidence-linked multimodal factuality
Difficulty: Grounded multimodal factuality
BenchLM stores Facts-VLM as a display-only benchmark reference when exact provider tables are available.
Version: Facts-VLM 2026
Refresh cadence: Quarterly
Staleness state: Current
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.
GLM-5V-Turbo by Zhipu AI currently leads with a score of 58.6% on Facts-VLM.
Two AI models have been evaluated on Facts-VLM on BenchLM.