Multiple Positional Self-Attention Network for Text Classification


  • Biyun Dai University of Science and Technology of China
  • Jinlong Li University of Science and Technology of China
  • Ruoyi Xu University of Science and Technology of China



Self-attention mechanisms have recently attracted considerable attention in Natural Language Processing (NLP) tasks, and relative positional information is important to them. We propose two masks: the Faraway Mask, which focuses on the (2m + 1)-gram words, and the Scaled-Distance Mask, which applies a logarithmic distance penalty; they respectively avoid and weaken the self-attention of distant words. To exploit different masks, we present a Positional Self-Attention Layer that generates different Masked-Self-Attentions, followed by a Position-Fusion Layer in which the fused positional information multiplies the Masked-Self-Attentions to generate sentence embeddings. To evaluate our sentence embedding approach, the Multiple Positional Self-Attention Network (MPSAN), we perform comparison experiments on sentiment analysis, semantic relatedness and sentence classification tasks. The results show that MPSAN outperforms state-of-the-art methods on five datasets, improving test accuracy by 0.81% and 0.6% on the SST and CR datasets, respectively. In addition, we reduce the number of training parameters and improve the time efficiency of MPSAN by lowering the dimensionality of self-attention and simplifying the fusion mechanism.
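
As a rough illustration of the two masks described in the abstract, the sketch below builds them as additive biases applied to attention scores before a softmax. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the window is read as m words on each side of the current word, the penalty is taken as -alpha * log(1 + |i - j|), and the names faraway_mask, scaled_distance_mask, masked_softmax and alpha are hypothetical.

```python
import math

NEG_INF = float("-inf")

def faraway_mask(n, m):
    """Assumed form: 0 inside the window |i - j| <= m, -inf outside it."""
    return [[0.0 if abs(i - j) <= m else NEG_INF for j in range(n)]
            for i in range(n)]

def scaled_distance_mask(n, alpha=1.0):
    """Assumed form: a penalty that grows logarithmically with distance."""
    return [[-alpha * math.log(1 + abs(i - j)) for j in range(n)]
            for i in range(n)]

def masked_softmax(scores, mask):
    """Add the mask to the raw scores, then normalize each row."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        logits = [s + b for s, b in zip(score_row, mask_row)]
        top = max(logits)  # finite, because the diagonal is never masked out
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Usage: uniform raw scores for a 5-word sentence, so only the masks matter.
scores = [[0.0] * 5 for _ in range(5)]
near_only = masked_softmax(scores, faraway_mask(5, m=1))
decayed = masked_softmax(scores, scaled_distance_mask(5))
```

Under these assumptions, the first mask removes attention to words outside the window entirely, while the second keeps every word reachable but gives distant words progressively less weight.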




How to Cite

Dai, B., Li, J., & Xu, R. (2020). Multiple Positional Self-Attention Network for Text Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 7610-7617.



AAAI Technical Track: Natural Language Processing