SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring

Authors

  • Yi Tay Nanyang Technological University
  • Minh Phan Nanyang Technological University
  • Luu Anh Tuan Agency for Science and Technology Research (A*Star), Institute for Infocomm Research
  • Siu Cheung Hui Nanyang Technological University

Keywords:

Essay Grading, Educational AI, LSTM, Deep Learning, ASAP Dataset, Kaggle, Essay

Abstract

Deep learning has demonstrated tremendous potential for Automatic Text Scoring (ATS) tasks. In this paper, we describe a new neural architecture that enhances vanilla neural network models with auxiliary neural coherence features. Our method introduces a SkipFlow mechanism that models relationships between snapshots of the hidden representations of a long short-term memory (LSTM) network as it reads. Subsequently, the semantic relationships between multiple snapshots are used as auxiliary features for prediction. This has two main benefits. Firstly, essays are typically long sequences, and therefore the memorization capability of the LSTM network may be insufficient. Implicit access to multiple snapshots can alleviate this problem by acting as a protection against vanishing gradients. The parameters of the SkipFlow mechanism also act as an auxiliary memory. Secondly, modeling relationships between multiple positions allows our model to learn features that represent and approximate textual coherence. We call these neural coherence features. Overall, we present a unified deep learning architecture that generates neural coherence features as it reads, in an end-to-end fashion. Our approach demonstrates state-of-the-art performance on the benchmark ASAP dataset, outperforming not only feature engineering baselines but also other deep learning models.
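
The idea of relating snapshots of the LSTM hidden state can be illustrated with a small sketch. The abstract does not specify how two snapshots are compared, so the code below assumes a bilinear form followed by a sigmoid, which yields one scalar per pair of snapshots. The names skip_features, bilinear_score, delta and M are hypothetical and chosen for illustration only, and the hidden states are hand-written stand-ins rather than the output of a trained LSTM.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def bilinear_score(a: Vector, M: Sequence[Vector], b: Vector, bias: float = 0.0) -> float:
    """Return sigmoid(a^T M b + bias), one scalar relating two snapshots.

    Assumption: the bilinear form is an illustrative choice, not
    necessarily the scoring function used in the paper.
    """
    total = bias
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            total += a_i * M[i][j] * b_j
    return sigmoid(total)


def skip_features(hidden_states: Sequence[Vector], M: Sequence[Vector], delta: int) -> List[float]:
    """Take a snapshot every `delta` time steps and score consecutive snapshots."""
    snapshots = hidden_states[::delta]
    return [bilinear_score(p, M, q) for p, q in zip(snapshots, snapshots[1:])]


# Stand-in hidden states for six time steps (dimension 2).
states = [[1.0, 0.0], [9.0, 9.0], [1.0, 0.0], [9.0, 9.0], [0.0, 1.0], [9.0, 9.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical learned matrix, fixed here

features = skip_features(states, identity, delta=2)
print(features)
```

With delta=2 the snapshots are taken at steps 0, 2 and 4, so two scores are produced. In a full model such scores would be concatenated with the usual sentence representation before the final prediction layer, and M would be learned jointly with the LSTM; both of those steps are omitted here.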

Published

2018-04-26

How to Cite

Tay, Y., Phan, M., Tuan, L. A., & Hui, S. C. (2018). SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/12045