Addressing the Under-Translation Problem from the Entropy Perspective

Authors

  • Yang Zhao Institute of Automation, Chinese Academy of Sciences
  • Jiajun Zhang Institute of Automation, Chinese Academy of Sciences
  • Chengqing Zong Institute of Automation, Chinese Academy of Sciences
  • Zhongjun He Baidu, Inc.
  • Hua Wu Baidu, Inc.

DOI:

https://doi.org/10.1609/aaai.v33i01.3301451

Abstract

Neural Machine Translation (NMT) has drawn much attention in recent years due to its promising translation performance. However, the under-translation problem remains a major challenge. In this paper, we focus on the under-translation problem and attempt to find out what kinds of source words are more likely to be ignored. Through analysis, we observe that a source word with a large translation entropy is more likely to be dropped. To address this problem, we propose a coarse-to-fine framework. In the coarse-grained phase, we introduce a simple strategy to reduce the entropy of high-entropy words by constructing pseudo target sentences. In the fine-grained phase, we propose three methods (a pre-training method, a multitask method, and a two-pass method) to encourage the neural model to correctly translate these high-entropy words. Experimental results on various translation tasks show that our method can significantly improve translation quality and substantially reduce the under-translation cases of high-entropy words.
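
To make the notion of translation entropy concrete, the following minimal sketch computes a per-word entropy from word-aligned sentence pairs. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact definition, so the sketch assumes the Shannon entropy (in nats) of a source word's lexical translation distribution, estimated by relative frequency from aligned word pairs. The function name `translation_entropy` and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def translation_entropy(aligned_pairs):
    """Map each source word to the entropy (in nats) of its translation distribution.

    aligned_pairs: iterable of (source_word, target_word) tuples, e.g. taken
    from a word aligner's output. Probabilities are relative frequencies
    (an assumption; the paper may estimate them differently).
    """
    counts = defaultdict(Counter)
    for src, tgt in aligned_pairs:
        counts[src][tgt] += 1

    entropy = {}
    for src, tgt_counts in counts.items():
        total = sum(tgt_counts.values())
        # H(s) = sum_t p(t|s) * log(1 / p(t|s))
        entropy[src] = sum(
            (c / total) * math.log(total / c) for c in tgt_counts.values()
        )
    return entropy


# Toy data: "bank" has two equally likely translations, "cat" has only one.
pairs = [
    ("bank", "banque"), ("bank", "rive"),
    ("bank", "banque"), ("bank", "rive"),
    ("cat", "chat"), ("cat", "chat"),
]
ent = translation_entropy(pairs)
```

Under this assumed definition, a word with a single consistent translation has zero entropy, while a word whose translations are spread over many candidates scores higher; the abstract's observation is that the latter kind of word is the one more likely to be dropped.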

Published

2019-07-17

How to Cite

Zhao, Y., Zhang, J., Zong, C., He, Z., & Wu, H. (2019). Addressing the Under-Translation Problem from the Entropy Perspective. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 451-458. https://doi.org/10.1609/aaai.v33i01.3301451

Section

AAAI Technical Track: AI and the Web