Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning

Authors

  • Lemao Liu National Institute of Information and Communications Technology
  • Andrew Finch National Institute of Information and Communications Technology
  • Masao Utiyama National Institute of Information and Communications Technology
  • Eiichiro Sumita National Institute of Information and Communications Technology

DOI:

https://doi.org/10.1609/aaai.v30i1.10327

Abstract

Recurrent neural networks, particularly the long short-term memory networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of sequence-level losses. Extensive experiments were performed on two standard sequence-to-sequence transduction tasks: machine transliteration and grapheme-to-phoneme transformation. The results show that the proposed approach achieves consistent and substantial improvements, compared to six state-of-the-art systems. In particular, our approach outperforms the best reported error rates by a margin (up to 9% relative gains) on the grapheme-to-phoneme task.
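The agreement idea in the abstract can be illustrated with a toy sketch. This is our own minimal illustration, not the authors' implementation: each candidate target is scored under both a left-to-right (L2R) and a right-to-left (R2L) model, and the candidate with the highest combined score is kept. The model names and scores below are hypothetical stand-ins for directional LSTM log-probabilities.

```python
# Toy illustration of agreement between two target-directional models.
# A candidate with a good prefix but a bad suffix scores well under the
# L2R model but poorly under the R2L model, so the joint (agreement)
# score filters it out in favor of a balanced candidate.

def score_l2r(target):
    # Hypothetical left-to-right model: rewards a good prefix (toy log-probs).
    return {"good-prefix-bad-suffix": -1.0, "balanced": -1.5}.get(target, -10.0)

def score_r2l(target):
    # Hypothetical right-to-left model: rewards a good suffix.
    return {"good-prefix-bad-suffix": -8.0, "balanced": -1.5}.get(target, -10.0)

def rerank_by_agreement(candidates):
    # Joint score = sum of directional log-probs (product of probabilities).
    return max(candidates, key=lambda t: score_l2r(t) + score_r2l(t))

best = rerank_by_agreement(["good-prefix-bad-suffix", "balanced"])
print(best)  # → balanced
```

In the paper itself, exact agreement search is intractable, which is why the authors develop approximate search methods; this sketch only shows the reranking intuition.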

Published

2016-03-05

How to Cite

Liu, L., Finch, A., Utiyama, M., & Sumita, E. (2016). Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10327

Section

Technical Papers: NLP and Knowledge Representation