Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving
DOI:
https://doi.org/10.1609/aaai.v38i16.29744
Keywords:
NLP: Machine Translation, Multilinguality, Cross-Lingual NLP, ML: Representation Learning
Abstract
Most Bilingual Lexicon Induction (BLI) methods retrieve word translation pairs by finding the closest target word for a given source word based on cross-lingual word embeddings (WEs). However, we find that solely retrieving translation from the source-to-target perspective leads to some false positive translation pairs, which significantly harm the precision of BLI. To address this problem, we propose a novel and effective method to improve translation pair retrieval in cross-lingual WEs. Specifically, we consider both source-side and target-side perspectives throughout the retrieval process to alleviate false positive word pairings that emanate from a single perspective. On a benchmark dataset of BLI, our proposed method achieves competitive performance compared to existing state-of-the-art (SOTA) methods. It demonstrates effectiveness and robustness across six experimental languages, including similar language pairs and distant language pairs, under both supervised and unsupervised settings.
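The general idea of checking a candidate pair from both the source side and the target side can be illustrated with a simple mutual nearest-neighbour filter. This is only a toy sketch and is not the retrieval method proposed in the paper: the function names (`forward_pairs`, `mutual_pairs`), the cosine-similarity scoring, and the two-dimensional vectors are all assumptions made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(query, candidates):
    """Return the word in `candidates` whose vector is closest to `query`."""
    return max(candidates, key=lambda w: cosine(query, candidates[w]))


def forward_pairs(src, tgt):
    """Source-to-target retrieval only: every source word gets a translation."""
    return {s: nearest(vec, tgt) for s, vec in src.items()}


def mutual_pairs(src, tgt):
    """Keep a pair only if each word is the other's nearest neighbour."""
    fwd = forward_pairs(src, tgt)
    bwd = {t: nearest(vec, src) for t, vec in tgt.items()}
    return {s: t for s, t in fwd.items() if bwd[t] == s}


# Hypothetical embeddings, assumed to live in a shared cross-lingual space.
src = {"chat": [1.0, 0.1], "chaton": [0.9, 0.5]}
tgt = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}

one_way = forward_pairs(src, tgt)
two_way = mutual_pairs(src, tgt)
```

In this toy data, one-directional retrieval maps both "chat" and "chaton" to "cat", while the bi-directional check keeps only the pair ("chat", "cat"), because "cat" retrieves "chat" rather than "chaton" from the target side.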
Published
2024-03-24
How to Cite
Ding, Q., Cao, H., & Zhao, T. (2024). Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16), 17898-17906. https://doi.org/10.1609/aaai.v38i16.29744
Section
AAAI Technical Track on Natural Language Processing I