A question-answering dataset modeled after open-book exams used to assess human understanding of a subject. Answering requires combining facts from a knowledge base with broad common sense reasoning.
Year
2018
Tasks
Open book questions
Format
Multiple choice questions
Difficulty
Elementary science level
OpenBookQA tests the ability to combine explicit knowledge with implicit common sense reasoning. Each question requires understanding scientific facts and applying them to novel situations, mimicking real open-book examinations.
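As a rough illustration of the format described above, the sketch below models a multiple-choice item together with the explicit fact it builds on, and scores a list of predicted answer labels. The `OpenBookItem` schema, its field names, and both example items are hypothetical and invented for this sketch; they are not the dataset's official schema or contents. The `accuracy` helper assumes scores are plain accuracy expressed as a percentage, which this page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenBookItem:
    """One multiple-choice question (hypothetical schema, not the official one)."""
    question: str
    choices: dict[str, str]  # answer label -> answer text
    answer: str              # label of the correct choice
    fact: str                # explicit "open book" fact the question builds on


def accuracy(items: list[OpenBookItem], predictions: list[str]) -> float:
    """Return the percentage of items whose predicted label matches the gold label."""
    if len(items) != len(predictions):
        raise ValueError("need exactly one prediction per item")
    if not items:
        raise ValueError("cannot score an empty item list")
    correct = sum(item.answer == pred for item, pred in zip(items, predictions))
    return 100.0 * correct / len(items)


# Invented items for illustration only; they are not drawn from OpenBookQA.
ITEMS = [
    OpenBookItem(
        question="Which object would carry an electric current best?",
        choices={
            "A": "a wool scarf",
            "B": "a suit of armor",
            "C": "a rubber boot",
            "D": "a wooden spoon",
        },
        answer="B",
        # The fact alone is not enough: the solver must also know that
        # a suit of armor is made of metal (the common sense step).
        fact="Metals conduct electricity.",
    ),
    OpenBookItem(
        question="Where would a potted fern grow worst?",
        choices={
            "A": "on a windowsill",
            "B": "in a closed closet",
            "C": "in a greenhouse",
            "D": "on a balcony",
        },
        answer="B",
        fact="Plants need sunlight to make food.",
    ),
]

if __name__ == "__main__":
    # One correct and one incorrect prediction.
    print(accuracy(ITEMS, ["B", "A"]))
```

In both items the stated fact does not name the correct choice directly; the link between the fact and the answer (armor is metal, a closet is dark) is the part left to common sense.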
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
GPT-5.4 by OpenAI currently leads with a score of 93 on OpenBookQA.
88 AI models have been evaluated on OpenBookQA on BenchLM.