Exploring “Just Noticeable” Group Fairness in Rankings
DOI:
https://doi.org/10.1609/aies.v8i1.36532
Abstract
The plethora of fairness metrics developed for ranking-based decision-making raises the question of which metrics align best with people’s perceptions of fairness, and why. Most prior studies examining people’s perceptions of fairness metrics use ordinal rating scales (e.g., Likert scales). However, such scales can be interpreted ambiguously across participants and offer only imprecise connections to specific interface features. We address this gap by adapting two-alternative forced choice methodologies—used extensively outside the fairness community for comparing visual stimuli—to quantitatively compare participant perceptions, fairness metrics, and ranking characteristics. We report a crowdsourced experiment with 224 participants across four conditions: two popular rank fairness metrics—ARP and NDKL—and two ranking characteristics—lists of 20 and 100 candidates—resulting in over 170,000 individual judgments. Our quantitative results show systematic patterns of differences between the metrics, as well as surprising exceptions where fairness metrics disagree with people’s perceptions. Our qualitative analysis reveals an interplay between cognitive and visual strategies that affects people’s perceptions of fairness. From these results, we discuss future work on aligning fairness metrics with people’s perceptions, and highlight the need for, and benefits of, expanding methodologies for fairness studies.
Published
2025-10-15
How to Cite
Alkhathlan, M., Shrestha, H., Harrison, L., & Rundensteiner, E. (2025). Exploring “Just Noticeable” Group Fairness in Rankings. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8(1), 76-89. https://doi.org/10.1609/aies.v8i1.36532