Can Private Machine Learning Be Fair?

Authors

  • Joseph Rance University of Cambridge
  • Filip Svoboda University of Cambridge

DOI:

https://doi.org/10.1609/aaai.v39i19.34216

Abstract

We show that current state-of-the-art methods for privately and fairly training models are unreliable in many practical scenarios. Specifically, we (1) introduce a new type of adversarial attack that seeks to introduce unfairness into private model training, and (2) demonstrate that the use of methods for training on private data that are robust to adversarial attacks often leads to unfair models, regardless of the use of fairness-enhancing training methods. This leads to a dilemma when attempting to train fair models on private data: either (A) we use a robust training method which may itself introduce unfairness into the model, or (B) we train models which are vulnerable to adversarial attacks that introduce unfairness. This paper highlights flaws in robust learning methods when training fair models, yielding a new perspective for the design of robust and private learning systems.

Published

2025-04-11

How to Cite

Rance, J., & Svoboda, F. (2025). Can Private Machine Learning Be Fair?. Proceedings of the AAAI Conference on Artificial Intelligence, 39(19), 20121–20129. https://doi.org/10.1609/aaai.v39i19.34216

Section

AAAI Technical Track on Machine Learning V