The Problems with Proxies: Making Data Work Visible through Requester Practices

Annabel Rothschild; Ding Wang; Niveditha Jayakumar Vilvanathan; Lauren Wilcox; Carl DiSalvo; Betsy DiSalvo

doi:10.1609/aies.v7i1.31721

Authors

Annabel Rothschild, Georgia Institute of Technology
Ding Wang, Google Research
Niveditha Jayakumar Vilvanathan, Georgia Institute of Technology
Lauren Wilcox, Georgia Institute of Technology
Carl DiSalvo, Georgia Institute of Technology
Betsy DiSalvo, Georgia Institute of Technology

Abstract

Fairness in AI and ML systems is increasingly linked to the proper treatment and recognition of data workers involved in training dataset development. Yet those who collect and annotate the data, and thus have the most intimate knowledge of its development, are often excluded from critical discussions. This exclusion prevents data annotators, who are domain experts, from contributing effectively to dataset contextualization. Our investigation into the hiring and engagement practices of 52 data work requesters on platforms such as Amazon Mechanical Turk reveals a gap: requesters frequently hold naive or unchallenged notions of worker identities and capabilities and rely on ad-hoc qualification tasks that fail to respect the workers’ expertise. These practices undermine not only the quality of data but also the ethical standards of AI development. To rectify these issues, we advocate for policy changes to enhance how data annotation tasks are designed and managed and to ensure data workers are treated with the respect they deserve.
