ACT: an Attentive Convolutional Transformer for Efficient Text Classification
Keywords: Text Classification & Sentiment Analysis, Representation Learning, Classification and Regression, General
Abstract
Recently, the Transformer has demonstrated promising performance on many NLP tasks and is increasingly replacing the Recurrent Neural Network (RNN). Meanwhile, the Convolutional Neural Network (CNN) has drawn less attention because of its weak ability to capture sequential and long-distance dependencies, although it has excellent local feature extraction capability. In this paper, we introduce an Attentive Convolutional Transformer (ACT) that combines the advantages of both the Transformer and the CNN for efficient text classification. Specifically, we propose a novel attentive convolution mechanism that uses the semantic meaning of convolutional filters attentively to transform text from a complex word space to a more informative convolutional filter space where important n-grams are captured. ACT is able to capture both local and global dependencies effectively while preserving sequential information. Experiments on various text classification tasks and detailed analyses show that ACT is a lightweight, fast, and effective universal text classifier, outperforming CNNs, RNNs, and attentive models including the Transformer.
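The abstract does not give the exact formulation of the attentive convolution mechanism, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes that each n-gram window is scored against every convolutional filter, that the scores are normalized with a softmax over filters, and that each filter has an associated learned vector (`filter_values`, a hypothetical name) that is mixed using those weights. The function name, the scaling factor, and all shapes are illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_convolution(embeddings, filters, filter_values):
    """Illustrative sketch only (assumed formulation, not taken from the paper).

    embeddings:    (seq_len, d)  word vectors
    filters:       (m, n, d)     m convolutional filters, each spanning n words
    filter_values: (m, d_out)    hypothetical learned vector per filter
    Returns (output, weights) with shapes
    (seq_len - n + 1, d_out) and (seq_len - n + 1, m).
    """
    seq_len, d = embeddings.shape
    m, n, _ = filters.shape
    # Sliding n-gram windows over the text: (positions, n, d).
    windows = np.stack([embeddings[i:i + n] for i in range(seq_len - n + 1)])
    # Convolution response of every filter at every position: (positions, m).
    scores = np.einsum("pnd,mnd->pm", windows, filters)
    # Assumption: attend over filters, with a scaled softmax as in dot-product attention.
    weights = softmax(scores / np.sqrt(n * d), axis=-1)
    # Each n-gram is re-expressed as a weighted mix of filter vectors ("filter space").
    return weights @ filter_values, weights

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 8))       # 10 words, 8-dim embeddings
flt = rng.normal(size=(4, 3, 8))     # 4 trigram filters
val = rng.normal(size=(4, 16))       # one 16-dim vector per filter
out, w = attentive_convolution(emb, flt, val)
print(out.shape, w.shape)
```

In this sketch the output keeps one row per n-gram position, so word order is preserved, and the attention weights indicate which filters (and hence which n-gram patterns) each position matches most strongly.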
How to Cite
Li, P., Zhong, P., Mao, K., Wang, D., Yang, X., Liu, Y., Yin, J., & See, S. (2021). ACT: an Attentive Convolutional Transformer for Efficient Text Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 35(15), 13261-13269. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17566
AAAI Technical Track on Speech and Natural Language Processing II