Unsupervised Interlingual Semantic Representations from Sentence Embeddings for Zero-Shot Cross-Lingual Transfer

Authors

  • Channy Hong, Harvard University
  • Jaeyeon Lee, Superb AI Inc.
  • Jungkwon Lee, Superb AI Inc.

DOI:

https://doi.org/10.1609/aaai.v34i05.6302

Abstract

As numerous modern NLP models demonstrate high performance on various tasks when trained with resource-rich language data sets such as those of English, attention has shifted to applying such learning to low-resource languages via zero-shot or few-shot cross-lingual transfer. While the most prominent previous efforts toward this goal entail the use of parallel corpora for sentence alignment training, we seek to generalize further by assuming plausible scenarios in which such parallel data sets are unavailable. In this work, we present a novel architecture for training interlingual semantic representations on top of sentence embeddings in a completely unsupervised manner, and demonstrate its effectiveness in zero-shot cross-lingual transfer on the natural language inference task. Furthermore, we showcase a method of leveraging this framework in a few-shot scenario, and finally analyze the distributional and permutational alignment of these interlingual semantic representations across languages.
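The zero-shot transfer setting the abstract describes can be illustrated with a minimal, hypothetical sketch: assume some interlingual encoder already maps sentences from any language into a shared semantic space (here simulated with synthetic class-conditional Gaussian embeddings, where the target-language embeddings are lightly perturbed copies of the source-language ones, standing in for a well-aligned space). A classifier is then trained on labeled embeddings of the resource-rich language only and applied directly to the target language. All names and data below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared semantic space: 3 NLI-style labels, 16-dim embeddings.
dim, n_per_class = 16, 200
centers = rng.normal(size=(3, dim))

def embed(lang_noise):
    # Simulated "encoder output": class center + within-class noise,
    # plus language-specific perturbation (0 for the source language).
    X = np.concatenate(
        [c + rng.normal(scale=0.3, size=(n_per_class, dim)) for c in centers]
    )
    y = np.repeat(np.arange(3), n_per_class)
    return X + rng.normal(scale=lang_noise, size=X.shape), y

X_en, y_en = embed(0.0)  # resource-rich training language (labeled)
X_tgt, y_tgt = embed(0.1)  # target language (labels used only for evaluation)

# Multinomial logistic regression trained on the source language only.
W = np.zeros((dim, 3))
Y = np.eye(3)[y_en]
for _ in range(300):
    logits = X_en @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    W -= 0.01 * X_en.T @ (P - Y) / len(X_en)

# Zero-shot: apply the source-trained classifier to target-language embeddings.
acc = (np.argmax(X_tgt @ W, axis=1) == y_tgt).mean()
print(f"zero-shot accuracy: {acc:.2f}")
```

The point of the sketch is that once embeddings of different languages are distributionally aligned in one space, a task classifier never needs target-language labels; the paper's contribution is obtaining such alignment without parallel corpora.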

Published

2020-04-03

How to Cite

Hong, C., Lee, J., & Lee, J. (2020). Unsupervised Interlingual Semantic Representations from Sentence Embeddings for Zero-Shot Cross-Lingual Transfer. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 7944-7951. https://doi.org/10.1609/aaai.v34i05.6302

Section

AAAI Technical Track: Natural Language Processing