Social Bias Meets Data Bias: The Impacts of Labeling and Measurement Errors on Fairness Criteria

Authors

  • Yiqiao Liao, The Ohio State University
  • Parinaz Naghizadeh, The Ohio State University

DOI:

https://doi.org/10.1609/aaai.v37i7.26054

Keywords:

ML: Bias and Fairness, PEAI: Societal Impact of AI

Abstract

Although many fairness criteria have been proposed to ensure that machine learning algorithms do not exhibit or amplify our existing social biases, these algorithms are trained on datasets that can themselves be statistically biased. In this paper, we investigate the robustness of existing (demographic) fairness criteria when the algorithm is trained on biased data. We consider two forms of dataset bias: errors by prior decision makers in the labeling process, and errors in the measurement of the features of disadvantaged individuals. We analytically show that some constraints (such as Demographic Parity) can remain robust when facing certain statistical biases, while others (such as Equalized Odds) are significantly violated if trained on biased data. We provide numerical experiments based on three real-world datasets (the FICO, Adult, and German credit score datasets) supporting our analytical findings. While fairness criteria are primarily chosen under normative considerations in practice, our results show that naively applying a fairness constraint can lead to not only a loss in utility for the decision maker, but more severe unfairness when data bias exists. Thus, understanding how fairness criteria react to different forms of data bias presents a critical guideline for choosing among existing fairness criteria, or for proposing new criteria, when available datasets may be biased.
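
The two criteria named in the abstract can be made concrete with a small sketch. It is not the paper's analysis, which concerns classifiers trained under fairness constraints on biased data. It only illustrates, on a fixed set of predictions, that Demographic Parity is computed without labels while Equalized Odds is computed against labels, so a labeling error changes the latter measurement. The toy data, the function names, and the choice to summarize Equalized Odds as the larger of the true-positive-rate and false-positive-rate gaps are all assumptions made for illustration.

```python
from typing import List


def rate(values: List[int]) -> float:
    """Fraction of ones in a list (0.0 for an empty list)."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_gap(pred: List[int], group: List[int]) -> float:
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    by_group = [rate([p for p, g in zip(pred, group) if g == a]) for a in (0, 1)]
    return abs(by_group[0] - by_group[1])


def equalized_odds_gap(pred: List[int], label: List[int], group: List[int]) -> float:
    """Larger of the TPR gap and the FPR gap between groups 0 and 1."""
    gaps = []
    for y in (1, 0):  # y = 1 compares true positive rates, y = 0 false positive rates
        by_group = [
            rate([p for p, l, g in zip(pred, label, group) if g == a and l == y])
            for a in (0, 1)
        ]
        gaps.append(abs(by_group[0] - by_group[1]))
    return max(gaps)


# Hypothetical toy data: four individuals per group.
group = [0, 0, 0, 0, 1, 1, 1, 1]
true_label = [1, 1, 0, 0, 1, 1, 0, 0]
# Assumed labeling error: one qualified member of group 1 is recorded as 0.
biased_label = [1, 1, 0, 0, 1, 0, 0, 0]
# Fixed predictions that happen to match the true labels.
pred = [1, 1, 0, 0, 1, 1, 0, 0]

dp_gap = demographic_parity_gap(pred, group)
eo_gap_true = equalized_odds_gap(pred, true_label, group)
eo_gap_biased = equalized_odds_gap(pred, biased_label, group)

print("Demographic Parity gap (no labels used):", dp_gap)
print("Equalized Odds gap against true labels:", eo_gap_true)
print("Equalized Odds gap against biased labels:", eo_gap_biased)
```

In this toy setting the Demographic Parity gap is the same whichever label set is available, while the Equalized Odds gap is zero against the true labels and nonzero against the biased ones. How each criterion behaves once the classifier itself is trained on biased data is the question the paper addresses.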

Published

2023-06-26

How to Cite

Liao, Y., & Naghizadeh, P. (2023). Social Bias Meets Data Bias: The Impacts of Labeling and Measurement Errors on Fairness Criteria. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 8764-8772. https://doi.org/10.1609/aaai.v37i7.26054

Section

AAAI Technical Track on Machine Learning II