SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction

Authors

  • Ju-Hyoung Lee Yonsei University
  • Sang-Ki Ko Kangwon National University
  • Yo-Sub Han Yonsei University

DOI:

https://doi.org/10.1609/aaai.v35i14.17558

Keywords:

Text Classification & Sentiment Analysis

Abstract

We propose a semi-supervised bootstrap learning framework for few-shot text classification. From a small initial labeled dataset, our framework obtains a larger set of reliable training data by using the attention weights from a trained LSTM-based classifier. We first train an LSTM-based text classifier with an attention mechanism on the given labeled dataset. Then, for each class, we collect a set of representative words, called a lexicon, based on the attention weights computed for the classification task. We bootstrap the classifier on new data labeled by the combination of the classifier and the constructed lexicons, thereby improving prediction accuracy. As a result, our approach outperforms previous state-of-the-art methods, including semi-supervised learning and pretraining algorithms, on the few-shot text classification task across four publicly available benchmark datasets. Moreover, we empirically confirm that the constructed lexicons are reliable and substantially improve the performance of the original classifier.
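The lexicon-construction step described above can be sketched in a few lines: for each class, accumulate the attention mass each word receives in that class's labeled documents and keep the highest-scoring words. This is a hypothetical simplification, not the paper's exact procedure; the function name `build_lexicons`, the toy documents, and the attention weights are all illustrative assumptions.

```python
from collections import defaultdict

def build_lexicons(docs, attn_weights, labels, top_k=3):
    """Accumulate attention mass per word for each class and keep the
    top-k words as that class's lexicon (hypothetical simplification
    of the paper's attention-based lexicon construction)."""
    scores = defaultdict(lambda: defaultdict(float))
    for tokens, weights, label in zip(docs, attn_weights, labels):
        for tok, w in zip(tokens, weights):
            scores[label][tok] += w
    lexicons = {}
    for label, word_scores in scores.items():
        # Rank words by accumulated attention and keep the top-k.
        ranked = sorted(word_scores, key=word_scores.get, reverse=True)
        lexicons[label] = ranked[:top_k]
    return lexicons

# Toy example with made-up attention weights from a trained classifier.
docs = [["great", "movie", "fun"], ["terrible", "boring", "plot"]]
attn = [[0.6, 0.1, 0.3], [0.5, 0.4, 0.1]]
labels = ["pos", "neg"]
print(build_lexicons(docs, attn, labels, top_k=2))
# → {'pos': ['great', 'fun'], 'neg': ['terrible', 'boring']}
```

In the bootstrapping loop, such lexicons would then vote alongside the classifier: an unlabeled document is added to the training set only when the classifier's prediction agrees with the lexicon evidence.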

Published

2021-05-18

How to Cite

Lee, J.-H., Ko, S.-K., & Han, Y.-S. (2021). SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 13189-13197. https://doi.org/10.1609/aaai.v35i14.17558

Section

AAAI Technical Track on Speech and Natural Language Processing I