Biases Mitigation and Expressiveness Preservation in Language Models: A Comprehensive Pipeline (Student Abstract)

Authors

  • Liu Yu, University of Electronic Science and Technology of China
  • Ludie Guo, University of Electronic Science and Technology of China
  • Ping Kuang, University of Electronic Science and Technology of China
  • Fan Zhou, University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v38i21.30532

Keywords:

Natural Language Processing, Social Bias, Bias Mitigation

Abstract

Pre-trained language models (PLMs) have greatly transformed various downstream tasks, yet they frequently display social biases inherited from their training data, raising fairness concerns. Recent efforts to debias PLMs come with limitations: they either fine-tune all parameters of the PLM, which is time-consuming and disregards the expressiveness of the PLM, or ignore the biases reintroduced when the debiased model is applied to downstream tasks. Hence, we propose a two-stage pipeline that mitigates biases from both internal and downstream contexts while preserving expressiveness in language models. Specifically, for the debiasing procedure we resort to continuous prefix-tuning rather than fully fine-tuning the PLM, designing a debiasing term for optimization and an alignment term that keeps words' relative distances and thereby ensures the model's expressiveness. For downstream tasks, we perform causal intervention across different demographic groups to obtain invariant predictions. Results on three GLUE tasks show that our method alleviates biases from both internal and downstream contexts while keeping PLM expressiveness intact.
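The abstract's two optimization terms could be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the mean-squared forms of both losses, and the `lam` weighting are all assumptions; the paper's actual terms operate inside continuous prefix-tuning and may differ.

```python
import numpy as np

def debias_term(emb_group_a, emb_group_b):
    """Hypothetical debiasing term: penalize the gap between
    representations of demographic counterpart sentences
    (e.g. 'he is a doctor' vs. 'she is a doctor')."""
    return float(np.mean((emb_group_a - emb_group_b) ** 2))

def alignment_term(emb_tuned, emb_frozen):
    """Hypothetical alignment term: keep words' pairwise relative
    distances close to those of the frozen original PLM, so the
    prefix-tuned model retains its expressiveness."""
    d_tuned = np.linalg.norm(emb_tuned[:, None] - emb_tuned[None, :], axis=-1)
    d_frozen = np.linalg.norm(emb_frozen[:, None] - emb_frozen[None, :], axis=-1)
    return float(np.mean((d_tuned - d_frozen) ** 2))

def pipeline_loss(emb_group_a, emb_group_b, emb_tuned, emb_frozen, lam=1.0):
    """Combined objective: debias across groups while staying
    aligned with the original representation geometry."""
    return debias_term(emb_group_a, emb_group_b) + lam * alignment_term(emb_tuned, emb_frozen)
```

Only the prefix parameters would be updated against this objective; the PLM itself stays frozen, which is what makes the approach cheaper than full fine-tuning.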

Published

2024-03-24

How to Cite

Yu, L., Guo, L., Kuang, P., & Zhou, F. (2024). Biases Mitigation and Expressiveness Preservation in Language Models: A Comprehensive Pipeline (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 38(21), 23701-23702. https://doi.org/10.1609/aaai.v38i21.30532