XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge

Authors

  • Xiaoze Jiang, Beihang University, Beijing, China
  • Yaobo Liang, Microsoft Research Asia, Beijing, China
  • Weizhu Chen, Microsoft Azure AI, Redmond, WA, USA
  • Nan Duan, Microsoft Research Asia, Beijing, China

DOI:

https://doi.org/10.1609/aaai.v36i10.21330

Keywords:

Speech & Natural Language Processing (SNLP)

Abstract

Cross-lingual pre-training has achieved great success using monolingual and bilingual plain text corpora. However, most pre-trained models neglect multilingual knowledge, which is language-agnostic but comprises abundant cross-lingual structure alignments. In this paper, we propose XLM-K, a cross-lingual language model that incorporates multilingual knowledge in pre-training. XLM-K augments existing multilingual pre-training with two knowledge tasks, namely the Masked Entity Prediction task and the Object Entailment task. We evaluate XLM-K on MLQA, NER and XNLI. Experimental results clearly demonstrate significant improvements over existing multilingual language models. The results on MLQA and NER exhibit the superiority of XLM-K on knowledge-related tasks. The success on XNLI shows the better cross-lingual transferability obtained by XLM-K. Moreover, we provide a detailed probing analysis to confirm the desired knowledge captured in our pre-training regimen. The code is available at https://github.com/microsoft/Unicoder/tree/master/pretraining/xlmk.
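
The sketch below illustrates, in PyTorch, how the two knowledge objectives named in the abstract could be framed: masked entity prediction scores a masked mention representation against an entity vocabulary, and object entailment predicts the object entity of a knowledge triple from a subject-plus-relation representation. The module name, the shared entity embedding table, and the cross-entropy formulation are assumptions made for illustration, not the authors' exact implementation; refer to the released code at https://github.com/microsoft/Unicoder/tree/master/pretraining/xlmk for the actual training objectives.

```python
# Illustrative sketch of the two knowledge tasks described in the abstract.
# All names and the loss formulation are assumptions, not the paper's exact method.
import torch
import torch.nn as nn


class KnowledgeHeads(nn.Module):
    def __init__(self, hidden_size: int, num_entities: int):
        super().__init__()
        # Assumed: a shared entity embedding table serving as the prediction
        # space for both knowledge objectives.
        self.entity_embeddings = nn.Embedding(num_entities, hidden_size)
        self.loss_fn = nn.CrossEntropyLoss()

    def masked_entity_prediction(self, mention_repr: torch.Tensor,
                                 gold_entity_ids: torch.Tensor) -> torch.Tensor:
        # Score each masked mention representation (from the multilingual
        # encoder) against all entities and train it to recover the linked
        # knowledge-base entity.
        logits = mention_repr @ self.entity_embeddings.weight.T
        return self.loss_fn(logits, gold_entity_ids)

    def object_entailment(self, subject_relation_repr: torch.Tensor,
                          object_entity_ids: torch.Tensor) -> torch.Tensor:
        # Given a representation of (subject description, relation), predict
        # the object entity of the corresponding knowledge triple.
        logits = subject_relation_repr @ self.entity_embeddings.weight.T
        return self.loss_fn(logits, object_entity_ids)


# Minimal usage example with random encoder outputs in place of real data.
if __name__ == "__main__":
    heads = KnowledgeHeads(hidden_size=768, num_entities=10_000)
    mention_repr = torch.randn(4, 768)          # batch of masked-mention vectors
    subj_rel_repr = torch.randn(4, 768)         # batch of (subject, relation) vectors
    mep_loss = heads.masked_entity_prediction(mention_repr, torch.randint(0, 10_000, (4,)))
    oe_loss = heads.object_entailment(subj_rel_repr, torch.randint(0, 10_000, (4,)))
    total_loss = mep_loss + oe_loss             # combined with the usual MLM/TLM losses
```
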

Published

2022-06-28

How to Cite

Jiang, X., Liang, Y., Chen, W., & Duan, N. (2022). XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), 10840-10848. https://doi.org/10.1609/aaai.v36i10.21330

Section

AAAI Technical Track on Speech and Natural Language Processing