From Good to Best: Two-Stage Training for Cross-Lingual Machine Reading Comprehension

Authors

  • Nuo Chen, ADSPLAB, School of ECE, Peking University, Shenzhen, China
  • Linjun Shou, STCA NLP Group, Microsoft, Beijing
  • Ming Gong, STCA NLP Group, Microsoft, Beijing
  • Jian Pei, School of Computing Science, Simon Fraser University

DOI:

https://doi.org/10.1609/aaai.v36i10.21293

Keywords:

Speech & Natural Language Processing (SNLP)

Abstract

Cross-lingual Machine Reading Comprehension (xMRC) is a challenging task due to the lack of training data in low-resource languages. Recent approaches use training data only in a resource-rich language (such as English) to fine-tune large-scale cross-lingual pre-trained language models, which transfer knowledge from the resource-rich (source) language to low-resource (target) languages. Because of the large differences between languages, a model fine-tuned only on the source language may not perform well on the target languages. In our study, we make an interesting observation: while the top-1 result predicted by previous approaches often fails to hit the ground-truth answer, the correct answer is still frequently contained in the set of top-k predictions. Intuitively, previous approaches have given the model a certain ability to roughly distinguish good answers from bad ones. However, without sufficient training data, it is not strong enough to capture the nuances between the accurate answer and the approximate ones. Based on this observation, we develop a two-stage approach to enhance model performance. The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer. The second stage focuses on precision: an answer-aware contrastive learning (AA-CL) mechanism is developed to learn the minute differences between the accurate answer and the other candidates. Extensive experiments show that our model significantly outperforms strong baselines on two cross-lingual MRC benchmark datasets.
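
The abstract describes the two stages only at a high level, so the snippet below is a minimal PyTorch sketch of the idea rather than the authors' implementation: stage 1 is approximated here as a log-likelihood restricted to the union of the model's top-k candidate spans and the gold span, and stage 2 as an InfoNCE-style contrastive loss that treats the gold span as the positive and the remaining top-k candidates as hard negatives for an answer-aware query vector. All function and variable names (hard_learning_loss, answer_aware_contrastive_loss, span_scores, query_repr, and so on) are hypothetical.

```python
# Minimal sketch of the two-stage training idea (recall-oriented HL, then
# precision-oriented AA-CL), under the assumptions stated above.
import torch
import torch.nn.functional as F


def hard_learning_loss(span_scores: torch.Tensor, gold_idx: int, k: int = 5) -> torch.Tensor:
    """Stage 1 (recall): encourage the gold span to survive among the top-k candidates.

    span_scores: (num_candidate_spans,) unnormalized scores from the MRC head.
    gold_idx:    index of the ground-truth span among the candidates.
    """
    topk = torch.topk(span_scores, k=min(k, span_scores.numel())).indices
    # Restrict the softmax to the top-k spans plus the gold span, so the model
    # only has to beat its own strongest competitors (an assumed approximation).
    pool = torch.unique(torch.cat([topk, torch.tensor([gold_idx])]))
    log_probs = F.log_softmax(span_scores[pool], dim=-1)
    gold_pos = (pool == gold_idx).nonzero(as_tuple=True)[0]
    return -log_probs[gold_pos].squeeze()


def answer_aware_contrastive_loss(query_repr: torch.Tensor,
                                  span_reprs: torch.Tensor,
                                  gold_idx: int,
                                  topk_idx: torch.Tensor,
                                  temperature: float = 0.1) -> torch.Tensor:
    """Stage 2 (precision): separate the gold span from the near-miss candidates.

    query_repr: (hidden,) answer-aware query representation.
    span_reprs: (num_candidate_spans, hidden) pooled span representations.
    topk_idx:   stage-1 top-k candidate indices, used as hard negatives.
    """
    negatives = topk_idx[topk_idx != gold_idx]
    cand_idx = torch.cat([torch.tensor([gold_idx]), negatives])  # positive first
    q = F.normalize(query_repr, dim=-1)
    c = F.normalize(span_reprs[cand_idx], dim=-1)
    logits = (c @ q) / temperature
    # InfoNCE as cross-entropy with target 0: the gold span should outscore
    # the other top-k candidates for the answer-aware query.
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))


if __name__ == "__main__":
    torch.manual_seed(0)
    scores = torch.randn(20)        # candidate span scores from the MRC head
    reprs = torch.randn(20, 768)    # candidate span representations
    query = torch.randn(768)        # answer-aware query representation
    topk = torch.topk(scores, k=5).indices
    print(hard_learning_loss(scores, gold_idx=3, k=5).item(),
          answer_aware_contrastive_loss(query, reprs, gold_idx=3, topk_idx=topk).item())
```

In practice the two losses would be applied in sequence, as in the paper's two stages: first fine-tune with the recall-oriented objective, then continue training with the contrastive objective over the resulting top-k candidates.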

Published

2022-06-28

How to Cite

Chen, N., Shou, L., Gong, M., & Pei, J. (2022). From Good to Best: Two-Stage Training for Cross-Lingual Machine Reading Comprehension. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), 10501-10508. https://doi.org/10.1609/aaai.v36i10.21293

Issue

Vol. 36 No. 10 (2022)

Section

AAAI Technical Track on Speech and Natural Language Processing