An OpenClaw-derived agent benchmark covering practical work and life tasks such as office document delivery, research, planning, and code maintenance.
As of March 2026, MiniMax M2.7 leads the MM-ClawBench leaderboard with 62.7%.
Year: 2026
Tasks: OpenClaw-style real-world tasks
Format: Agent workflow evaluation
Difficulty: Broad real-world agentic execution
MiniMax built MM-ClawBench from commonly used OpenClaw tasks to evaluate how well models handle broad real-world agent scenarios across work and personal productivity.
MiniMax's M2.7 currently leads MM-ClawBench with a score of 62.7%.
1 AI model has been evaluated on MM-ClawBench on BenchLM.