Navigating Towards Fairness with Data Selection
DOI:
https://doi.org/10.1609/aaai.v39i21.34422
Abstract
Machine learning algorithms often struggle to eliminate inherent data biases, particularly those arising from unreliable labels, which poses a significant challenge in ensuring fairness. Existing fairness techniques that address label bias typically involve modifying models and intervening in the training process, but these lack flexibility for large-scale datasets. To address this limitation, we introduce a data selection method designed to efficiently and flexibly mitigate label bias, tailored to more practical needs. Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set. This strategy, supported by peer predictions, ensures the fairness of the proxy model and eliminates the need for an additional holdout set, which is a common requirement in previous methods. Without altering the classifier's architecture, our modality-agnostic method selects appropriate training data, and experimental evaluations across diverse datasets show that it handles label bias and improves fairness both efficiently and effectively.
Published
2025-04-11
How to Cite
Zhang, Y., Li, Z., Wang, Y., Chen, F., Fan, X., & Zhou, F. (2025). Navigating Towards Fairness with Data Selection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(21), 22632–22640. https://doi.org/10.1609/aaai.v39i21.34422
Section
AAAI Technical Track on Machine Learning VII