Knowledge-aware Leap-LSTM: Integrating Prior Knowledge into Leap-LSTM towards Faster Long Text Classification

Authors

  • Jinhua Du — Investments AI, AIG
  • Yan Huang — Investments AI, AIG
  • Karo Moilanen — Investments AI, AIG

DOI:

https://doi.org/10.1609/aaai.v35i14.17511

Keywords:

Applications, Text Classification & Sentiment Analysis, (Deep) Neural Network Algorithms, Classification and Regression

Abstract

While widely used in industry, recurrent neural networks (RNNs) are known to have deficiencies in dealing with long sequences (e.g. slow inference, vanishing gradients, etc.). Recent research has attempted to accelerate RNN models by developing mechanisms to skip irrelevant words in the input. Due to the lack of labelled data, it remains a challenge to decide which words to skip, especially for low-resource classification tasks. In this paper, we propose Knowledge-Aware Leap-LSTM (KALL), a novel architecture which integrates prior human knowledge (created either manually or automatically), such as in-domain keywords, terminologies or lexicons, into Leap-LSTM to partially supervise the skipping process. More specifically, we propose a knowledge-oriented cost function for KALL; furthermore, we propose two strategies to integrate the knowledge: (1) the Factored KALL approach involves a keyword indicator as a soft constraint for the skipping process, and (2) the Gated KALL enforces the inclusion of keywords while maintaining a differentiable network in training. Experiments on different public datasets show that our approaches are 1.1x–2.6x faster than LSTM with better accuracy and 23.6x faster than XLNet in a resource-limited CPU-only environment.
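The two integration strategies described in the abstract can be illustrated with a toy sketch. The code below is NOT the authors' implementation: the linear keep-scorer (`W`, `b`), the bias weight `alpha`, and the function names are hypothetical stand-ins, shown only to convey the difference between a soft keyword constraint (Factored) and hard keyword inclusion (Gated) in a skip decision.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def kall_skip_sketch(embeddings, keyword_mask, W, b, alpha=2.0, gated=False):
    """Toy keyword-aware keep/skip decision over a token sequence.

    embeddings:   (T, d) token vectors
    keyword_mask: (T,) array; 1 where the token matches a prior-knowledge keyword
    W, b:         parameters of a hypothetical linear "keep" scorer
    alpha:        soft bias added to keywords' keep logits (Factored variant)
    gated:        if True, keywords are always kept (Gated variant)
    Returns a boolean keep-mask over the T tokens.
    """
    logits = embeddings @ W + b              # per-token keep score
    logits = logits + alpha * keyword_mask   # Factored: soft constraint on skipping
    keep = sigmoid(logits) > 0.5
    if gated:
        keep = keep | (keyword_mask == 1)    # Gated: enforce keyword inclusion
    return keep

# Toy usage: 6 tokens, 2 of which are keywords.
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))
kw = np.array([0, 1, 0, 0, 1, 0])
W = rng.normal(size=4)
keep_mask = kall_skip_sketch(emb, kw, W, b=-1.0, gated=True)
```

With `gated=True`, the keyword positions are guaranteed to survive the skip decision regardless of the scorer's output; the Factored variant only nudges their logits, leaving the network fully soft and differentiable end to end.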

Published

2021-05-18

How to Cite

Du, J., Huang, Y., & Moilanen, K. (2021). Knowledge-aware Leap-LSTM: Integrating Prior Knowledge into Leap-LSTM towards Faster Long Text Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12768-12775. https://doi.org/10.1609/aaai.v35i14.17511

Section

AAAI Technical Track on Speech and Natural Language Processing I