EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation

Authors

  • Yuqiao Wen, Dept. Computing Science, Alberta Machine Intelligence Institute (Amii), University of Alberta
  • Behzad Shayegh, Dept. Computing Science, Alberta Machine Intelligence Institute (Amii), University of Alberta
  • Chenyang Huang, Dept. Computing Science, Alberta Machine Intelligence Institute (Amii), University of Alberta
  • Yanshuai Cao, RBC Borealis
  • Lili Mou, Dept. Computing Science, Alberta Machine Intelligence Institute (Amii), University of Alberta; Canada CIFAR AI Chair, Amii

DOI:

https://doi.org/10.1609/aaai.v39i24.34737

Abstract

Zero-shot translation ability emerges when a multilingual model is trained on certain translation directions: the model can then translate directly in unseen directions. Alternatively, zero-shot translation can be accomplished by pivoting through a third language (e.g., English). In our work, we observe that both direct and pivot translations are noisy and achieve less than satisfactory performance. We propose EBBS, an ensemble method with a novel bi-level beam search algorithm, where each ensemble component explores its own prediction step by step at the lower level, but all components are synchronized by a "soft voting" mechanism at the upper level. Results on two popular multilingual translation datasets show that EBBS consistently outperforms direct and pivot translations, as well as existing ensemble techniques. Further, we can distill the ensemble's knowledge back to the multilingual model to improve inference efficiency; notably, our EBBS-distilled model can even outperform EBBS as it learns from the ensemble knowledge.
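
The abstract describes the bi-level search only at a high level, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes that each component is a function from a token prefix to a next-token distribution, that each component keeps its own top-k expansions of a shared beam (lower level), and that "soft voting" averages the candidates' scores across components, with a candidate that a component did not keep contributing zero (upper level). The function name `bi_level_beam_search`, the toy tables `direct` and `pivot`, and all probabilities are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]
Component = Callable[[Prefix], Dict[str, float]]
EOS = "</s>"


def bi_level_beam_search(
    components: List[Component], beam_size: int = 2, max_len: int = 5
) -> List[Tuple[Prefix, float]]:
    """Toy bi-level beam search (assumed form, not the paper's exact algorithm)."""
    upper: List[Tuple[Prefix, float]] = [((), 1.0)]  # shared upper-level beam
    for _ in range(max_len):
        votes: Dict[Prefix, float] = defaultdict(float)
        for next_token_probs in components:
            # Lower level: this component expands every shared prefix on its own.
            lower: List[Tuple[Prefix, float]] = []
            for prefix, score in upper:
                if prefix and prefix[-1] == EOS:
                    lower.append((prefix, score))  # finished hypotheses carry over
                    continue
                for token, prob in next_token_probs(prefix).items():
                    lower.append((prefix + (token,), score * prob))
            lower.sort(key=lambda cand: cand[1], reverse=True)
            # Upper level: soft voting adds this component's share of the score.
            for cand, cand_score in lower[:beam_size]:
                votes[cand] += cand_score / len(components)
        upper = sorted(votes.items(), key=lambda cand: cand[1], reverse=True)[:beam_size]
        if all(prefix[-1] == EOS for prefix, _ in upper):
            break
    return upper


def make_component(table: Dict[Prefix, Dict[str, float]]) -> Component:
    # Prefixes missing from the table end the sentence with probability 1.
    return lambda prefix: table.get(prefix, {EOS: 1.0})


# Hypothetical stand-ins for a direct and a pivot translation model.
direct = make_component({
    (): {"hello": 0.6, "hi": 0.4},
    ("hello",): {"world": 0.9, EOS: 0.1},
    ("hi",): {"world": 0.6, EOS: 0.4},
})
pivot = make_component({
    (): {"hello": 0.3, "hi": 0.7},
    ("hello",): {"world": 0.8, EOS: 0.2},
    ("hi",): {"there": 0.7, "world": 0.3},
})

result = bi_level_beam_search([direct, pivot], beam_size=2, max_len=4)
best_hypothesis, best_score = result[0]
print(best_hypothesis, round(best_score, 4))
```

In this toy setting the two components disagree on the first token, yet the hypothesis that both of them keep in their lower-level beams, `("hello", "world", "</s>")`, accumulates the most votes and ends up at the top of the shared beam.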

Published

2025-04-11

How to Cite

Wen, Y., Shayegh, B., Huang, C., Cao, Y., & Mou, L. (2025). EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25479–25487. https://doi.org/10.1609/aaai.v39i24.34737

Issue

Section

AAAI Technical Track on Natural Language Processing III