From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project

Peter Clark; Oren Etzioni; Tushar Khot; Daniel Khashabi; Bhavana Dalvi Mishra; Kyle Richardson; Ashish Sabharwal; Carissa Schoenick; Oyvind Tafjord; Niket Tandon; Sumithra Bhakthavatsalam; Dirk Groeneveld; Michal Guerquin; Michael Schmitz

doi:10.1609/aimag.v41i4.5304

Authors

Peter Clark Allen Institute for AI
Oren Etzioni Allen Institute for AI
Tushar Khot Allen Institute for AI
Daniel Khashabi Allen Institute for AI
Bhavana Dalvi Mishra Allen Institute for AI
Kyle Richardson Allen Institute for AI
Ashish Sabharwal Allen Institute for AI
Carissa Schoenick Allen Institute for AI
Oyvind Tafjord Allen Institute for AI
Niket Tandon Allen Institute for AI
Sumithra Bhakthavatsalam Allen Institute for AI
Dirk Groeneveld Allen Institute for AI
Michal Guerquin Allen Institute for AI
Michael Schmitz Allen Institute for AI

DOI:

https://doi.org/10.1609/aimag.v41i4.5304

Abstract

AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy!, but the rich variety of standardized exams has remained a landmark challenge. Even as recently as 2016, the best AI system could achieve merely 59.3 percent on an 8th grade science exam. This article reports success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more than 90 percent on the exam’s nondiagram, multiple choice (NDMC) questions. In addition, our Aristo system, building upon the success of recent language models, exceeded 83 percent on the corresponding Grade 12 Science Exam NDMC questions. The results, on unseen test questions, are robust across different test years and different variations of this kind of test. They demonstrate that modern natural language processing methods can result in mastery on this task. While not a full solution to general question-answering (the questions are limited to 8th grade multiple-choice science), it represents a significant milestone for the field.

Published

2020-12-28

How to Cite

Clark, P., Etzioni, O., Khot, T., Khashabi, D., Mishra, B., Richardson, K., Sabharwal, A., Schoenick, C., Tafjord, O., Tandon, N., Bhakthavatsalam, S., Groeneveld, D., Guerquin, M., & Schmitz, M. (2020). From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project. AI Magazine, 41(4), 39-53. https://doi.org/10.1609/aimag.v41i4.5304

Issue

Vol. 41 No. 4: Winter 2020

Section

Special Topic Articles

License

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Authors who publish with this journal agree to the following terms:

The author(s) warrants that they are the sole author and owner of the copyright in the above article/paper, except for those portions shown to be in quotations; that the article/paper is original throughout; and that the undersigned right to make the grants set forth above is complete and unencumbered.
The author(s) agree that if anyone brings any claim or action alleging facts that, if true, constitute a breach of any of the foregoing warranties, the author(s) will hold harmless and indemnify AAAI, their grantees, their licensees, and their distributors against any liability, whether under judgment, decree, or compromise, and any legal fees and expenses arising out of that claim or actions, and the undersigned will cooperate fully in any defense AAAI may make to such claim or action. Moreover, the undersigned agrees to cooperate in any claim or other action seeking to protect or enforce any right the undersigned has granted to AAAI in the article/paper. If any such claim or action fails because of facts that constitute a breach of any of the foregoing warranties, the undersigned agrees to reimburse whomever brings such claim or action for expenses and attorneys’ fees incurred therein.
Author(s) retain all proprietary rights other than copyright (such as patent rights).
Author(s) may make personal reuse of all or portions of the above article/paper in other works of their own authorship.
Author(s) may reproduce, or have reproduced, their article/paper for the author’s personal use, or for company use provided that original work is properly cited, and that the copies are not used in a way that implies AAAI endorsement of a product or service of an employer, and that the copies per se are not offered for sale. The foregoing right shall not permit the posting of the article/paper in electronic or digital form on any computer network, except by the author or the author’s employer, and then only on the author’s or the employer’s own web page or ftp site. Such web page or ftp site, in addition to the aforementioned requirements of this Paragraph, must provide an electronic reference or link back to the AAAI electronic server, and shall not post other AAAI copyrighted materials not of the author’s or the employer’s creation (including tables of contents with links to other papers) without AAAI’s written permission.
Author(s) may make limited distribution of all or portions of their article/paper prior to publication.
In the case of work performed under U.S. Government contract, AAAI grants the U.S. Government royalty-free permission to reproduce all or portions of the above article/paper, and to authorize others to do so, for U.S. Government purposes.
In the event the above article/paper is not accepted and published by AAAI, or is withdrawn by the author(s) before acceptance by AAAI, this agreement becomes null and void.
