Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions


  • Peter Clark, Allen Institute for AI
  • Oren Etzioni, Allen Institute for AI
  • Tushar Khot, Allen Institute for AI
  • Ashish Sabharwal, Allen Institute for AI
  • Oyvind Tafjord, Allen Institute for AI
  • Peter Turney, Allen Institute for AI
  • Daniel Khashabi, Univ. Illinois at Urbana-Champaign



Keywords: question answering, natural language processing, machine learning, ensemble methods


What capabilities are required for an AI system to pass standard 4th grade science tests? Previous work examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline. In this paper, we describe an alternative approach that operates at three levels of representation and reasoning: information retrieval, corpus statistics, and simple inference over a semi-automatically constructed knowledge base. This approach achieves substantially improved results. We evaluate the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam (using only non-diagram, multiple-choice questions), and show that our overall system scores 71.3%, an improvement of 23.8 percentage points (absolute) over the MLN-based method described in previous work. We conclude with a detailed analysis illustrating the complementary strengths of each method in the ensemble. Our datasets are being released to enable further research.




How to Cite

Clark, P., Etzioni, O., Khot, T., Sabharwal, A., Tafjord, O., Turney, P., & Khashabi, D. (2016). Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1).



Technical Papers: NLP and Knowledge Representation