Analogy Training Multilingual Encoders

Nicolas Garneau; Mareike Hartmann; Anders Sandholm; Sebastian Ruder; Ivan Vulić; Anders Søgaard

doi:10.1609/aaai.v35i14.17524

Authors

Nicolas Garneau, Université Laval
Mareike Hartmann, University of Copenhagen
Anders Sandholm, Google Research
Sebastian Ruder, DeepMind
Ivan Vulić, University of Cambridge
Anders Søgaard, University of Copenhagen

DOI:

https://doi.org/10.1609/aaai.v35i14.17524

Keywords:

Language Models

Abstract

Language encoders encode words and phrases in ways that capture their local semantic relatedness, but are known to be globally inconsistent. Global inconsistency can seemingly be corrected for, in part, by leveraging signals from knowledge bases, but previous results are partial and limited to monolingual English encoders. We extract a large-scale multilingual, multi-word analogy dataset from Wikidata for diagnosing and correcting for global inconsistencies, and then implement a four-way Siamese BERT architecture for grounding multilingual BERT (mBERT) in Wikidata through analogy training. We show that analogy training not only improves the global consistency of mBERT, as well as the isomorphism of language-specific subspaces, but also leads to consistent gains on downstream tasks such as bilingual dictionary induction and sentence retrieval.
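The abstract does not spell out the training objective, so the following is only an illustrative sketch of what "global consistency" over an analogy a : b :: c : d could mean, assuming the common vector-offset view in which the offset from a to b should point the same way as the offset from c to d. The function names and the toy two-dimensional vectors are hypothetical; in a four-way Siamese setup the four vectors would instead come from one shared encoder applied to each of the four phrases.

```python
import math


def offset(src, tgt):
    """Element-wise difference tgt - src between two embedding vectors."""
    return [t - s for s, t in zip(src, tgt)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy_consistency(a, b, c, d):
    """Score how well a : b :: c : d holds under the offset assumption.

    Assumption (not from the paper): an analogy is consistent when the
    offsets b - a and d - c are parallel, i.e. their cosine is close to 1.
    """
    return cosine(offset(a, b), offset(c, d))


# Hypothetical toy embeddings standing in for encoder outputs.
copenhagen, denmark = [1.0, 0.0], [1.0, 1.0]
paris, france = [3.0, 0.5], [3.0, 1.5]
berlin, germany = [2.0, 0.0], [3.0, 0.0]

consistent = analogy_consistency(copenhagen, denmark, paris, france)
inconsistent = analogy_consistency(copenhagen, denmark, berlin, germany)
print(consistent, inconsistent)
```

Under this assumption, a training signal could push the score toward 1 for analogies drawn from a knowledge base; whether the paper uses this exact formulation is not stated in the abstract.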
