A comprehensive multiple-choice question answering test covering 57 tasks, including elementary mathematics, US history, computer science, and law. It tests knowledge across diverse academic subjects from elementary to professional level.
Year: 2020
Tasks: 57 subjects
Format: Multiple-choice questions
Difficulty: Elementary to professional level
MMLU evaluates models on 57 subjects spanning humanities, social sciences, STEM, and other areas. Questions range from elementary to advanced professional level, making it a comprehensive test of world knowledge and reasoning ability.
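To make the multiple-choice format concrete, here is a minimal sketch of how answers on a benchmark of this shape could be scored. It assumes accuracy (fraction of questions answered correctly) as the metric and shows both a per-question average and a per-subject average. The function name `score_multiple_choice`, the subject labels, and the toy records are hypothetical illustrations, not BenchLM's or the benchmark authors' actual scoring code.

```python
from collections import defaultdict


def score_multiple_choice(records):
    """Score (subject, predicted_choice, correct_choice) tuples.

    Returns per-subject accuracy, the micro average (over all questions),
    and the macro average (mean of per-subject accuracies).
    Assumes at least one record is supplied.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, predicted, answer in records:
        total[subject] += 1
        if predicted == answer:
            correct[subject] += 1

    per_subject = {s: correct[s] / total[s] for s in total}
    micro = sum(correct.values()) / sum(total.values())
    macro = sum(per_subject.values()) / len(per_subject)
    return per_subject, micro, macro


# Hypothetical toy data: two subjects, six questions in total.
records = [
    ("elementary_mathematics", "B", "B"),
    ("elementary_mathematics", "C", "A"),
    ("us_history", "D", "D"),
    ("us_history", "A", "A"),
    ("us_history", "B", "C"),
    ("us_history", "B", "B"),
]

per_subject, micro, macro = score_multiple_choice(records)
print(per_subject)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

The two averages differ whenever subjects have unequal question counts, so a reported score is only comparable across leaderboards that use the same aggregation.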
Measuring Massive Multitask Language Understanding
GPT-5.4 by OpenAI currently leads with a score of 99 on MMLU.
88 AI models have been evaluated on MMLU on BenchLM.