Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving

Authors

  • Qiuyu Ding, Harbin Institute of Technology
  • Hailong Cao, Harbin Institute of Technology
  • Tiejun Zhao, Harbin Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v38i16.29744

Keywords:

NLP: Machine Translation, Multilinguality, Cross-Lingual NLP, ML: Representation Learning

Abstract

Most Bilingual Lexicon Induction (BLI) methods retrieve word translation pairs by finding the closest target word for a given source word based on cross-lingual word embeddings (WEs). However, we find that retrieving translations solely from the source-to-target perspective produces false-positive translation pairs, which significantly harm the precision of BLI. To address this problem, we propose a novel and effective method to improve translation pair retrieval in cross-lingual WEs. Specifically, we consider both the source-side and target-side perspectives throughout the retrieval process to reduce the false-positive word pairings that arise from a single perspective. On a benchmark BLI dataset, our proposed method achieves competitive performance compared to existing state-of-the-art (SOTA) methods. It is effective and robust across six experimental languages, covering both similar and distant language pairs, under both supervised and unsupervised settings.
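To make the contrast between single-direction and two-sided retrieval concrete, the following is a minimal sketch and not the authors' method, whose details the abstract does not give. It assumes the source and target embeddings already live in a shared space, uses plain cosine similarity, and swaps in a simple mutual-nearest-neighbour check as a stand-in for a bi-directional criterion. The function names and the toy vectors are hypothetical.

```python
import numpy as np

def normalize(X):
    """Scale each row to unit length so dot products are cosine similarities."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def retrieve_forward(src, tgt):
    """Source-to-target only: index of the closest target word for each source word."""
    sim = normalize(src) @ normalize(tgt).T
    return sim.argmax(axis=1)

def retrieve_bidirectional(src, tgt):
    """Keep (source, target) pairs only when each is the other's nearest neighbour.

    Assumption: mutual nearest neighbours is used here purely as an
    illustrative two-sided check, not as the paper's retrieval rule.
    """
    sim = normalize(src) @ normalize(tgt).T
    fwd = sim.argmax(axis=1)  # best target for each source word
    bwd = sim.argmax(axis=0)  # best source for each target word
    return [(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i]

# Hypothetical toy embeddings: three source and three target words in 2-D.
src = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
tgt = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

print(retrieve_forward(src, tgt).tolist())  # [0, 0, 1]
print(retrieve_bidirectional(src, tgt))     # [(0, 0), (2, 1)]
```

In this toy case the forward pass assigns target word 0 to both source words 0 and 1. The two-sided check drops the pair (1, 0) because target word 0 prefers source word 0, which is the kind of single-perspective false positive the abstract describes. A strict mutual check also lowers coverage, since source word 1 is left without a translation.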

Published

2024-03-24

How to Cite

Ding, Q., Cao, H., & Zhao, T. (2024). Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16), 17898-17906. https://doi.org/10.1609/aaai.v38i16.29744

Section

AAAI Technical Track on Natural Language Processing I