Transductive Ensemble Learning for Neural Machine Translation

Yiren Wang; Lijun Wu; Yingce Xia; Tao Qin; ChengXiang Zhai; Tie-Yan Liu

doi:10.1609/aaai.v34i04.6097

Authors

Yiren Wang, University of Illinois at Urbana-Champaign
Lijun Wu, Sun Yat-sen University
Yingce Xia, Microsoft Research Asia
Tao Qin, Microsoft Research Asia
ChengXiang Zhai, University of Illinois at Urbana-Champaign
Tie-Yan Liu, Microsoft Research Asia

DOI: https://doi.org/10.1609/aaai.v34i04.6097

Abstract

Ensemble learning, which aggregates multiple diverse models for inference, is a common practice for improving accuracy in machine learning tasks. However, conventional ensemble methods have been observed to bring only marginal improvement for neural machine translation (NMT) when the individual models are strong or when there are many of them. In this paper, we study how to effectively aggregate multiple NMT models in the transductive setting, where the source sentences of the test set are known. We propose a simple yet effective approach named transductive ensemble learning (TEL), in which we use all individual models to translate the source test set into the target language space and then finetune a strong model on the resulting synthetic corpus. We conduct extensive experiments in different settings (with and without monolingual data) and on different language pairs (English↔{German, Finnish}). The results show that our approach significantly improves strong individual models and benefits substantially from additional individual models. Specifically, we achieve state-of-the-art performance on WMT 2016-2018 English↔German translation.
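
The two steps the abstract describes (translate the test sources with every individual model, then finetune one model on the result) can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the `Translator` type, the `finetune` callback, and the function names are hypothetical placeholders, and the abstract does not say how the synthetic pairs are filtered, deduplicated, or weighted, so the sketch simply keeps every pair.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a translator maps one source sentence to one target sentence.
Translator = Callable[[str], str]
SentencePair = Tuple[str, str]
# Hypothetical interface: finetuning takes a model and a corpus and returns a new model.
Finetuner = Callable[[Translator, List[SentencePair]], Translator]


def build_synthetic_corpus(
    models: Sequence[Translator], test_sources: Sequence[str]
) -> List[SentencePair]:
    """Pair each test source sentence with every individual model's translation."""
    corpus: List[SentencePair] = []
    for translate in models:
        for src in test_sources:
            corpus.append((src, translate(src)))
    return corpus


def transductive_ensemble(
    models: Sequence[Translator],
    base_model: Translator,
    test_sources: Sequence[str],
    finetune: Finetuner,
) -> Translator:
    """Finetune `base_model` on the synthetic corpus produced by all models.

    Assumption: `base_model` is whichever strong model is chosen for finetuning;
    the training procedure itself is delegated to the caller-supplied `finetune`.
    """
    corpus = build_synthetic_corpus(models, test_sources)
    return finetune(base_model, corpus)
```

With K individual models and N test sentences, the synthetic corpus in this sketch holds K × N sentence pairs. Only source sentences from the test set are used, which is what makes the setting transductive: no reference translations are needed.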
