Sound Check: Auditing Recent Audio Dataset Practices
DOI: https://doi.org/10.1609/aies.v8i1.36528

Abstract
Audio AI models are increasingly used for a broad range of applications including music and sound generation, text-to-speech (TTS), voice cloning, emotion analysis, transcription, and audio classification. However, we have little understanding of the datasets used to create audio AI models, a gap that leaves the field without a powerful tool for understanding potential biases, toxicity, copyright violations, and other ethical and performance issues. We conduct a mapping literature review of hundreds of audio datasets used in recent music, sound, and speech AI papers. We first assess the sourcing, size, and usage of these datasets, finding that while there are hundreds of audio datasets, few are widely used. Next, we identify nine representative datasets and conduct several analyses to understand bias, toxicity, representation, and quality. We find that these datasets are often biased against women, contain stereotypes about marginalized communities, and include significant amounts of copyrighted work. We also find that audio datasets often come with scant documentation. To address this gap, we extend Gebru et al.'s datasheets for datasets to audio data, providing domain-specific documentation guidance. Finally, to facilitate public exploration of dataset contents and accountability, we developed an audio dataset exploration web tool, which is available in our linked materials along with our code and an extended version of our work, including the appendix and augmented datasheets for datasets. Content warning: this paper contains discussions of offensive language.
Published
2025-10-15
How to Cite
Agnew, W., Barnett, J., Chu, A., Hong, R., Feffer, M., Netzorg, R., Jiang, H. H., Awumey, E., & Das, S. (2025). Sound Check: Auditing Recent Audio Dataset Practices. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8(1), 26-40. https://doi.org/10.1609/aies.v8i1.36528