The Problems with Proxies: Making Data Work Visible through Requester Practices

Authors

  • Annabel Rothschild, Georgia Institute of Technology
  • Ding Wang, Google Research
  • Niveditha Jayakumar Vilvanathan, Georgia Institute of Technology
  • Lauren Wilcox, Georgia Institute of Technology
  • Carl DiSalvo, Georgia Institute of Technology
  • Betsy DiSalvo, Georgia Institute of Technology

DOI:

https://doi.org/10.1609/aies.v7i1.31721

Abstract

Fairness in AI and ML systems is increasingly linked to the proper treatment and recognition of the data workers involved in training dataset development. Yet those who collect and annotate the data, and who thus have the most intimate knowledge of its development, are often excluded from critical discussions. This exclusion prevents data annotators, who are domain experts, from contributing effectively to dataset contextualization. Our investigation into the hiring and engagement practices of 52 data work requesters on platforms like Amazon Mechanical Turk reveals a gap: requesters frequently hold naive or unchallenged notions of worker identities and capabilities, and they rely on ad hoc qualification tasks that fail to respect workers’ expertise. These practices undermine not only the quality of the data but also the ethical standards of AI development. To rectify these issues, we advocate for policy changes that improve how data annotation tasks are designed and managed and that ensure data workers are treated with the respect they deserve.

Published

2024-10-16

How to Cite

Rothschild, A., Wang, D., Jayakumar Vilvanathan, N., Wilcox, L., DiSalvo, C., & DiSalvo, B. (2024). The Problems with Proxies: Making Data Work Visible through Requester Practices. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 1255-1268. https://doi.org/10.1609/aies.v7i1.31721