Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance

Authors

  • Guanhua Chen The University of Hong Kong
  • Yun Chen Shanghai University of Finance and Economics
  • Victor O.K. Li The University of Hong Kong

Keywords:

Machine Translation & Multilinguality

Abstract

Lexically constrained neural machine translation (NMT), which leverages pre-specified translations to constrain NMT, has practical significance in interactive translation and NMT domain adaptation. Previous work either modifies the decoding algorithm or trains the model on an augmented dataset. These methods suffer from either high computational overhead or low copying success rates. In this paper, we investigate Att-Input and Att-Output, two alignment-based constrained decoding methods. These two methods revise the target tokens during decoding based on word alignments derived from encoder-decoder attention weights. Our study shows that Att-Input translates better while Att-Output is more computationally efficient. Capitalizing on both strengths, we further propose EAM-Output by introducing an explicit alignment module (EAM) to a pretrained Transformer. It decodes similarly to Att-Output, except that it uses alignments derived from the EAM. We leverage the word alignments induced by Att-Input as labels and train the EAM while keeping the parameters of the Transformer frozen. Experiments on WMT16 De-En and WMT16 Ro-En show the effectiveness of our approaches on constrained NMT. In particular, the proposed EAM-Output method consistently outperforms previous approaches in translation quality, with light computational overhead over the unconstrained baseline.
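The alignment-based constrained decoding described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact algorithm; it is a toy assuming single-token constraints and a precomputed attention distribution over source positions, showing the core idea: at each decoding step, the source position receiving the highest encoder-decoder attention weight determines whether the model's predicted token is replaced by a pre-specified constraint token.

```python
# Toy sketch of alignment-guided constrained decoding (assumptions: one
# decoding step, single-token constraints; NOT the paper's full method).

def constrained_step(model_token, attn_weights, constraints):
    """Revise one decoding step using attention-derived alignment.

    model_token:  the token the NMT model would emit at this step.
    attn_weights: encoder-decoder attention over source positions.
    constraints:  hypothetical map from a source position to the
                  pre-specified target token it must translate to.
    """
    # Derive a word alignment: the source position attended to most.
    aligned_src = max(range(len(attn_weights)), key=attn_weights.__getitem__)
    # If the step aligns to a constrained source word, copy its
    # pre-specified translation; otherwise keep the model's token.
    return constraints.get(aligned_src, model_token)

# Step attends mostly to source position 2, which carries a constraint,
# so the constraint token overrides the model's prediction.
print(constrained_step("home", [0.1, 0.2, 0.7], {2: "house"}))  # house
# Without alignment to a constrained position, the model token survives.
print(constrained_step("home", [0.8, 0.1, 0.1], {2: "house"}))  # home
```

In the paper's EAM-Output variant, the attention-based alignment above is replaced by alignments predicted by the trained explicit alignment module, which avoids recomputing revisions from raw attention at decoding time.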

Published

2021-05-18

How to Cite

Chen, G., Chen, Y., & Li, V. O. (2021). Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12630-12638. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17496

Section

AAAI Technical Track on Speech and Natural Language Processing I