Generalizing and Improving Bilingual Word Embedding Mappings with a Multi-Step Framework of Linear Transformations

Mikel Artetxe; Gorka Labaka; Eneko Agirre

doi:10.1609/aaai.v32i1.11992

Authors

Mikel Artetxe University of the Basque Country (UPV/EHU)
Gorka Labaka University of the Basque Country (UPV/EHU)
Eneko Agirre University of the Basque Country (UPV/EHU)

Keywords:

cross-lingual word embeddings, bilingual word embedding mappings, bilingual lexicon extraction

Abstract

Using a dictionary to map independently trained word embeddings to a shared space has been shown to be an effective approach to learning bilingual word embeddings. In this work, we propose a multi-step framework of linear transformations that generalizes a substantial body of previous work. The core step of the framework is an orthogonal transformation, and existing methods can be explained in terms of the additional normalization, whitening, re-weighting, de-whitening, and dimensionality reduction steps. This allows us to gain new insights into the behavior of existing methods, including the effectiveness of inverse regression, and to design a novel variant that obtains the best published results in zero-shot bilingual lexicon extraction. The corresponding software is released as an open source project.
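
To make the core step concrete, the following is a minimal sketch of an orthogonal mapping between two embedding matrices whose rows are aligned by a dictionary. It uses the standard orthogonal Procrustes solution via SVD, which is one common way to compute such a transformation. It is not taken from the released software, and it omits the whitening, re-weighting, de-whitening, and dimensionality reduction steps. The function names, the toy data, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def length_normalize(m):
    """Scale each row (word vector) to unit Euclidean length."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / norms


def orthogonal_map(x, z):
    """Return the orthogonal W minimizing ||x @ W - z||_F.

    x and z hold source and target embeddings, one dictionary pair per row.
    This is the standard Procrustes solution: W = U V^T for x^T z = U S V^T.
    """
    u, _, vt = np.linalg.svd(x.T @ z)
    return u @ vt


# Toy data (hypothetical): the "target" space is an exact rotation/reflection
# of the "source" space, so the mapping should recover it.
rng = np.random.default_rng(0)
x = length_normalize(rng.standard_normal((50, 4)))
q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix
z = x @ q

w = orthogonal_map(x, z)
```

On this toy input, `x @ w` should match `z` up to floating-point error, and `w` should be orthogonal, so mapped vectors keep their lengths and pairwise dot products. Real embeddings are only approximately related by such a map.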

Published

2018-04-27

How to Cite

Artetxe, M., Labaka, G., & Agirre, E. (2018). Generalizing and Improving Bilingual Word Embedding Mappings with a Multi-Step Framework of Linear Transformations. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11992

Issue

Vol. 32 No. 1 (2018): Thirty-Second AAAI Conference on Artificial Intelligence

Section

Main Track: NLP and Machine Learning
