Gaussian Transformer: A Lightweight Approach for Natural Language Inference


  • Maosheng Guo Harbin Institute of Technology
  • Yu Zhang Harbin Institute of Technology
  • Ting Liu Harbin Institute of Technology



Natural Language Inference (NLI) is an active research area, in which numerous approaches based on recurrent neural networks (RNNs), convolutional neural networks (CNNs), and self-attention networks (SANs) have been proposed. Although these models obtain impressive performance, previous recurrent approaches are hard to train in parallel; convolutional models tend to require more parameters; and self-attention networks are not good at capturing the local dependencies of texts. To address these problems, we introduce a Gaussian prior into the self-attention mechanism to better model the local structure of sentences. We then propose an efficient RNN/CNN-free architecture named Gaussian Transformer for NLI, which consists of encoding blocks that model both local and global dependencies, high-order interaction blocks that collect evidence for multi-step inference, and a lightweight comparison block that saves many parameters. Experiments show that our model achieves new state-of-the-art performance on both the SNLI and MultiNLI benchmarks with significantly fewer parameters and considerably less training time. In addition, evaluation on the Hard NLI datasets demonstrates that our approach is less affected by undesirable annotation artifacts.
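To illustrate the core idea of a Gaussian locality prior on self-attention, the sketch below adds a distance-based Gaussian bias to standard dot-product attention logits, down-weighting distant positions before the softmax. This is a minimal illustrative assumption about the mechanism described in the abstract, not the paper's exact parameterization (the function name, the single `sigma` hyperparameter, and the additive-bias form are all assumptions for demonstration).

```python
import numpy as np

def gaussian_self_attention(x, sigma=1.0):
    """Dot-product self-attention with an illustrative Gaussian locality prior.

    x: (seq_len, d) array of token representations.
    A bias of -(i - j)^2 / (2 * sigma^2) is added to the attention
    logits, so attention concentrates on nearby positions.
    (Hypothetical sketch; not the paper's exact formulation.)
    """
    seq_len, d = x.shape
    # Standard scaled dot-product logits (Q = K = x for simplicity).
    logits = x @ x.T / np.sqrt(d)
    # Gaussian prior over position distances.
    pos = np.arange(seq_len)
    dist2 = (pos[:, None] - pos[None, :]) ** 2
    logits = logits - dist2 / (2.0 * sigma ** 2)
    # Numerically stable softmax over the key dimension.
    logits -= logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x
```

With a small `sigma`, the bias term dominates and each position attends mostly to its immediate neighbors, capturing the local structure that plain self-attention tends to miss; a large `sigma` recovers behavior close to ordinary global attention.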




How to Cite

Guo, M., Zhang, Y., & Liu, T. (2019). Gaussian Transformer: A Lightweight Approach for Natural Language Inference. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 6489-6496.



AAAI Technical Track: Natural Language Processing