Exploring “Just Noticeable” Group Fairness in Rankings
DOI:
https://doi.org/10.1609/aies.v8i1.36532
Abstract
The plethora of fairness metrics developed for ranking-based decision-making raises the question of which metrics align best with people’s perceptions of fairness, and why. Most prior studies examining people’s perceptions of fairness metrics use ordinal rating scales (e.g., Likert scales). However, such scales can be interpreted ambiguously across participants and offer only imprecise connections to specific interface features. We address this gap by adapting two-alternative forced choice methodologies—used extensively outside the fairness community for comparing visual stimuli—to quantitatively compare participant perceptions, fairness metrics, and ranking characteristics. We report a crowdsourced experiment with 224 participants across four conditions: two popular rank fairness metrics—ARP and NDKL—and two ranking characteristics—lists of 20 and 100 candidates—resulting in over 170,000 individual judgments. Our quantitative results show systematic patterns of differences between the metrics, as well as surprising exceptions where fairness metrics disagree with people’s perceptions. Our qualitative analysis reveals an interplay between cognitive and visual strategies that affects people’s perceptions of fairness. From these results, we discuss future work on aligning fairness metrics with people’s perceptions, and highlight the need for, and benefits of, expanding methodologies for fairness studies.
Published
2025-10-15
How to Cite
Alkhathlan, M., Shrestha, H., Harrison, L., & Rundensteiner, E. (2025). Exploring “Just Noticeable” Group Fairness in Rankings. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8(1), 76-89. https://doi.org/10.1609/aies.v8i1.36532