Sound Check: Auditing Recent Audio Dataset Practices

William Agnew; Julia Barnett; Annie Chu; Rachel Hong; Michael Feffer; Robin Netzorg; Harry H. Jiang; Ezra Awumey; Sauvik Das

doi:10.1609/aies.v8i1.36528

Authors

William Agnew, Carnegie Mellon University
Julia Barnett, Northwestern University
Annie Chu, Northwestern University
Rachel Hong, University of Washington
Michael Feffer, Carnegie Mellon University
Robin Netzorg, University of California, Berkeley
Harry H. Jiang, Carnegie Mellon University
Ezra Awumey, Carnegie Mellon University
Sauvik Das, Carnegie Mellon University

DOI: https://doi.org/10.1609/aies.v8i1.36528

Abstract

Audio AI models are increasingly used for a broad range of applications including music and sound generation, text-to-speech (TTS), voice cloning, emotion analysis, transcription, and audio classification. However, we have little understanding of the datasets used to create audio AI models, a gap that leaves the field without a powerful tool for understanding potential biases, toxicity, copyright violations, and other ethical and performance issues. We conduct a mapping literature review of hundreds of audio datasets used in recent music, sound, and speech AI papers. We first assess the sourcing, size, and usage of these datasets, finding that while there are hundreds of audio datasets, few are widely used. Next, we identify nine representative datasets and conduct several analyses to understand bias, toxicity, representation, and quality. We find that these datasets are often biased against women, have stereotypes about marginalized communities, and contain significant amounts of copyrighted work. We also find that audio datasets often come with scant documentation. To address this gap, we extend Gebru’s datasheets for datasets to audio data, providing domain-specific documentation guidance. Finally, to facilitate public exploration of dataset contents and accountability, we developed an audio datasets exploration web tool which is available below in our links, along with our code and an extended version of our work including the appendix and augmented datasheets for datasets. Content warning: this paper contains discussions of offensive language.
