Neural Bag-of-Ngrams

Authors

  • Bofang Li, Renmin University of China
  • Tao Liu, Renmin University of China
  • Zhe Zhao, Renmin University of China
  • Puwei Wang, Renmin University of China
  • Xiaoyong Du, Renmin University of China

DOI:

https://doi.org/10.1609/aaai.v31i1.10954

Abstract

Bag-of-ngrams (BoN) models are commonly used for representing text. One of the main drawbacks of traditional BoN is that it ignores n-gram semantics. In this paper, we introduce the concept of Neural Bag-of-ngrams (Neural-BoN), which replaces the sparse one-hot n-gram representations in traditional BoN with dense, semantically rich n-gram representations. We first propose context-guided n-gram representations, obtained by adding n-grams to a word embedding model. However, the context-guided learning strategy of word embeddings is likely to miss semantics relevant to text-level tasks. We therefore propose text-guided and label-guided n-gram representations to capture additional semantics such as topic or sentiment tendencies. Neural-BoN with the latter two n-gram representations achieves state-of-the-art results on 4 document-level classification datasets and 6 semantic relatedness categories, and is on par with some sophisticated DNNs on 3 sentence-level classification datasets. Like traditional BoN, Neural-BoN is efficient, robust, and easy to implement. We expect it to serve as a strong baseline and to be used in more real-world applications.
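The core contrast in the abstract can be sketched in a few lines: traditional BoN represents a text as a sparse count vector with one dimension per n-gram, while Neural-BoN composes a dense vector from learned n-gram embeddings. The sketch below uses random vectors as hypothetical placeholders for learned embeddings and simple averaging as the composition; the paper's context-, text-, and label-guided training objectives are not reproduced here.

```python
# Minimal sketch: sparse traditional BoN vs. a Neural-BoN-style dense
# representation. Embedding values are random placeholders, NOT the
# learned n-gram vectors the paper trains.
import numpy as np

def extract_ngrams(tokens, n_max=2):
    """Collect all n-grams up to length n_max (uni- and bi-grams here)."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def traditional_bon(tokens, vocab):
    """Sparse count vector: one dimension per n-gram in the vocabulary."""
    vec = np.zeros(len(vocab))
    for g in extract_ngrams(tokens):
        if g in vocab:
            vec[vocab[g]] += 1
    return vec

def neural_bon(tokens, embeddings, dim):
    """Dense vector: average the embeddings of the text's n-grams."""
    grams = [g for g in extract_ngrams(tokens) if g in embeddings]
    if not grams:
        return np.zeros(dim)
    return np.mean([embeddings[g] for g in grams], axis=0)

tokens = "not a good movie".split()
vocab = {g: i for i, g in enumerate(sorted(set(extract_ngrams(tokens))))}
rng = np.random.default_rng(0)
dim = 8  # embedding dimension (illustrative)
embeddings = {g: rng.standard_normal(dim) for g in vocab}

sparse = traditional_bon(tokens, vocab)      # shape (|vocab|,), mostly zeros
dense = neural_bon(tokens, embeddings, dim)  # shape (dim,), dense semantics
print(sparse.shape, dense.shape)
```

Note that the bigram "not a" gets its own dimension and its own embedding, which is how BoN-style models capture local word order (e.g. negation) that a pure bag-of-words loses.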

Published

2017-02-12

How to Cite

Li, B., Liu, T., Zhao, Z., Wang, P., & Du, X. (2017). Neural Bag-of-Ngrams. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.10954

Section

Main Track: NLP and Knowledge Representation