ACT: an Attentive Convolutional Transformer for Efficient Text Classification

Authors

  • Pengfei Li (Nanyang Technological University, Singapore)
  • Peixiang Zhong (Nanyang Technological University, Singapore)
  • Kezhi Mao (Nanyang Technological University, Singapore)
  • Dongzhe Wang (Zhuiyi Technology, Shenzhen, China)
  • Xuefeng Yang (Zhuiyi Technology, Shenzhen, China)
  • Yunfeng Liu (Zhuiyi Technology, Shenzhen, China)
  • Jianxiong Yin (NVIDIA AI Tech Center)
  • Simon See (NVIDIA AI Tech Center)

Keywords

Text Classification & Sentiment Analysis, Representation Learning, Classification and Regression, General

Abstract

The Transformer has recently shown promising performance on many NLP tasks and is increasingly replacing Recurrent Neural Networks (RNNs). Meanwhile, Convolutional Neural Networks (CNNs) have received less attention because they are weak at capturing sequential and long-distance dependencies, even though they excel at local feature extraction. In this paper, we introduce an Attentive Convolutional Transformer (ACT) that combines the advantages of the Transformer and the CNN for efficient text classification. Specifically, we propose a novel attentive convolution mechanism that attentively uses the semantic meaning of convolutional filters to transform text from the complex word space into a more informative convolutional filter space in which important n-grams are captured. ACT captures both local and global dependencies effectively while preserving sequential information. Experiments on various text classification tasks and detailed analyses show that ACT is a lightweight, fast, and effective universal text classifier that outperforms CNNs, RNNs, and attentive models, including the Transformer.
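
The abstract does not give the equations of the attentive convolution mechanism, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each n-gram window is scored against every convolutional filter, that the scores are normalized with a softmax, and that the output is the resulting weighted mix of one learned vector per filter. The function names, the per-filter `values` vectors, and the softmax normalization are all hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_convolution(embeddings, filters, values, width):
    """Map each n-gram window to a mix of per-filter vectors (illustrative only).

    embeddings: list of T word vectors, each of dimension d
    filters:    list of K filters, each a flat list of width * d weights
    values:     list of K vectors (assumed here), one per filter
    width:      n-gram size covered by each filter
    Returns T - width + 1 output vectors, one per window position.
    """
    outputs = []
    for start in range(len(embeddings) - width + 1):
        # Flatten the n-gram window into one vector, as a 1-D convolution would.
        window = [x for vec in embeddings[start:start + width] for x in vec]
        # Convolution response of each filter: dot product with the window.
        scores = [sum(w * x for w, x in zip(f, window)) for f in filters]
        # Attention over filters: how strongly each filter matches this n-gram.
        weights = softmax(scores)
        # Output lives in "filter space": a weighted sum of per-filter vectors.
        mixed = [
            sum(weights[k] * values[k][j] for k in range(len(filters)))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Toy input: 4 words with 2-dimensional embeddings, 3 bigram filters.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
bigram_filters = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [-1.0, 0.5, 0.5, -1.0],
]
filter_values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outputs = attentive_convolution(sentence, bigram_filters, filter_values, width=2)
```

Because the windows are processed left to right, the output keeps one vector per n-gram position, which is one simple way sequential order could be preserved. How ACT actually captures global dependencies is not specified in the abstract and is not modeled here.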

Published

2021-05-18

How to Cite

Li, P., Zhong, P., Mao, K., Wang, D., Yang, X., Liu, Y., Yin, J., & See, S. (2021). ACT: an Attentive Convolutional Transformer for Efficient Text Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 35(15), 13261-13269. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17566

Section

AAAI Technical Track on Speech and Natural Language Processing II