Debiasing NLU Models via Causal Intervention and Counterfactual Reasoning

Bing Tian; Yixin Cao; Yong Zhang; Chunxiao Xing

doi:10.1609/aaai.v36i10.21389

Authors

Bing Tian, Tsinghua University
Yixin Cao, Singapore Management University
Yong Zhang, Tsinghua University, China
Chunxiao Xing, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v36i10.21389

Keywords:

Speech & Natural Language Processing (SNLP)

Abstract

Recent studies have shown that strong Natural Language Understanding (NLU) models are prone to relying on annotation biases in datasets as a shortcut, which goes against the underlying mechanisms of the task of interest. To reduce such biases, several recent works introduce debiasing methods to regularize the training process of targeted NLU models. In this paper, we provide a new perspective, using causal inference to identify the bias. On the one hand, we show that there is an unobserved confounder for the natural language utterances and their respective classes, leading to spurious correlations in the training data. To remove this confounder, we use backdoor adjustment with causal intervention to find the true causal effect, which makes the training process fundamentally different from traditional likelihood estimation. On the other hand, in the inference process, we formulate the bias as the direct causal effect and remove it by pursuing the indirect causal effect with counterfactual reasoning. We conduct experiments on large-scale natural language inference and fact verification benchmarks, evaluating on bias-sensitive datasets that are specifically designed to assess the robustness of models against known biases in the training data. Experimental results show that our proposed debiasing framework outperforms previous state-of-the-art debiasing methods while maintaining the original in-distribution performance.
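To make the idea of backdoor adjustment mentioned in the abstract concrete, here is a minimal toy sketch. It is not the authors' implementation: the binary variables, the probability tables, and the function names are all hypothetical, and the confounder Z is assumed to be observable and discrete purely for illustration (the abstract describes an unobserved confounder). It contrasts the ordinary conditional P(Y=1 | X=x), which weights Z by P(z | x), with the interventional quantity P(Y=1 | do(X=x)) = sum_z P(Y=1 | x, z) P(z), which weights Z by its prior.

```python
# Toy model (all numbers are illustrative assumptions):
# a confounder Z influences both the input feature X and the label Y.
p_z = {0: 0.5, 1: 0.5}                      # P(Z = z)
p_x1_given_z = {0: 0.1, 1: 0.9}             # P(X = 1 | Z = z)
p_y1_given_xz = {                           # P(Y = 1 | X = x, Z = z)
    (0, 0): 0.1, (1, 0): 0.2,
    (0, 1): 0.7, (1, 1): 0.8,
}


def p_x_given_z(x, z):
    """P(X = x | Z = z) for binary X."""
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]


def observational(x):
    """P(Y=1 | X=x): Z is weighted by the posterior P(z | x)."""
    joint = {z: p_x_given_z(x, z) * p_z[z] for z in p_z}
    norm = sum(joint.values())
    return sum(p_y1_given_xz[(x, z)] * joint[z] / norm for z in p_z)


def interventional(x):
    """P(Y=1 | do(X=x)) via backdoor adjustment: Z is weighted by the prior P(z)."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


if __name__ == "__main__":
    for x in (0, 1):
        print(x, round(observational(x), 4), round(interventional(x), 4))
```

In this toy setting the observational gap between X=1 and X=0 is much larger than the interventional gap, because most of the apparent association between X and Y is carried by Z. That is the kind of spurious correlation the adjustment is meant to remove.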
