Debiasing Intrinsic Bias and Application Bias Jointly via Invariant Risk Minimization (Student Abstract)

Authors

  • Yuzhou Mao University of Electronic Science and Technology of China
  • Liu Yu University of Electronic Science and Technology of China
  • Yi Yang Hong Kong University of Science and Technology
  • Fan Zhou University of Electronic Science and Technology of China
  • Ting Zhong University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v37i13.27000

Keywords:

Social Biases, Debiasing Method, Invariant Risk Minimization, Causal

Abstract

Demographic biases and social stereotypes are common in pretrained language models (PLMs), while fine-tuning in downstream applications can also produce new biases or amplify the original ones. Existing works separate debiasing from the fine-tuning procedure, which leaves a gap between intrinsic bias and application bias. In this work, we propose a debiasing framework, CauDebias, that eliminates both biases by combining debiasing directly with fine-tuning; it can be applied to any PLM in downstream tasks. We distinguish the bias-relevant (non-causal) and label-relevant (causal) parts of sentences from a causal invariant perspective. Specifically, we perform interventions on non-causal factors across demographic groups, and then devise an invariant risk minimization loss to trade off bias mitigation against task accuracy. Experimental results on three downstream tasks show that CauDebias remarkably reduces bias in PLMs while minimizing the impact on downstream performance.
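The abstract does not give the loss in closed form; as a rough illustration only, the following is a minimal sketch of a generic IRMv1-style objective (average risk plus a per-environment gradient penalty on a dummy scale), which the invariant risk minimization term described above resembles. All names here are hypothetical, the environments stand in for demographic groups, and squared loss is assumed for simplicity:

```python
import numpy as np

def irm_penalty(logits, labels):
    """IRMv1-style penalty with squared loss: the squared gradient of the
    per-environment risk w.r.t. a dummy scale s, evaluated at s = 1.
    risk(s) = mean((s * logits - labels)**2), so
    d risk / d s |_{s=1} = mean(2 * (logits - labels) * logits)."""
    grad = np.mean(2.0 * (logits - labels) * logits)
    return grad ** 2

def irm_objective(envs, lam):
    """Average empirical risk across environments (e.g. demographic
    groups) plus lam times the summed invariance penalties."""
    risks = [np.mean((logits - labels) ** 2) for logits, labels in envs]
    penalties = [irm_penalty(logits, labels) for logits, labels in envs]
    return np.mean(risks) + lam * np.sum(penalties)
```

A large `lam` pushes the model toward predictors whose per-group risk is stationary in the dummy scale, i.e. toward features that are predictive in every group, which is the intuition behind using causal (label-relevant) rather than non-causal (bias-relevant) factors.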

Published

2023-09-06

How to Cite

Mao, Y., Yu, L., Yang, Y., Zhou, F., & Zhong, T. (2023). Debiasing Intrinsic Bias and Application Bias Jointly via Invariant Risk Minimization (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 16280-16281. https://doi.org/10.1609/aaai.v37i13.27000