A Generalized Language Model in Tensor Space

Authors

  • Lipeng Zhang, Tianjin University
  • Peng Zhang, Tianjin University
  • Xindian Ma, Tianjin University
  • Shuqin Gu, Tianjin University
  • Zhan Su, Tianjin University
  • Dawei Song, Beijing Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v33i01.33017450

Abstract

In the literature, tensors have been effectively used to capture context information in language models. However, existing methods usually adopt relatively low-order tensors, which have limited expressive power for modeling language. Developing a higher-order tensor representation is challenging, both in deriving an effective solution and in demonstrating its generality. In this paper, we propose a language model named the Tensor Space Language Model (TSLM), which utilizes tensor networks and tensor decomposition. In TSLM, we build a high-dimensional semantic space constructed from the tensor product of word vectors. Theoretically, we prove that such a tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probabilities for language modeling. Experimental results on the Penn Tree Bank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
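To make the "tensor product of word vectors" idea concrete, the following minimal NumPy sketch (not the authors' implementation; the vocabulary size, word IDs, and variable names are illustrative) shows how a sentence can be represented as the tensor product of one-hot word vectors, and why an n-gram probability table is a tensor of the same shape, which is the intuition behind the generalization claim in the abstract.

    import numpy as np

    # Toy setup: vocabulary of size V and a 3-word sentence given by word IDs.
    V = 5
    sentence = [0, 3, 2]

    # One-hot vector for each word in the sentence.
    word_vecs = [np.eye(V)[w] for w in sentence]

    # Order-3 tensor via repeated outer (tensor) products: shape (V, V, V),
    # with a single nonzero entry indexed by the three word IDs.
    T = np.einsum('i,j,k->ijk', *word_vecs)
    assert T[0, 3, 2] == 1.0 and T.sum() == 1.0

    # A trigram language model is a tensor of the same shape:
    # P[i, j, k] = p(w3 = k | w1 = i, w2 = j). Contracting it with the
    # one-hot vectors of an observed bigram recovers the conditional
    # distribution over the next word.
    P = np.random.rand(V, V, V)
    P /= P.sum(axis=2, keepdims=True)          # normalize over the last word
    p_next = np.einsum('ijk,i,j->k', P, word_vecs[0], word_vecs[1])
    assert np.isclose(p_next.sum(), 1.0)

The paper's contribution goes further by decomposing such a high-order tensor so that the conditional probabilities can be computed recursively rather than stored explicitly; the sketch above only illustrates the representational correspondence with n-gram models.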

Published

2019-07-17

How to Cite

Zhang, L., Zhang, P., Ma, X., Gu, S., Su, Z., & Song, D. (2019). A Generalized Language Model in Tensor Space. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 7450-7458. https://doi.org/10.1609/aaai.v33i01.33017450

Section

AAAI Technical Track: Natural Language Processing