Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

Ahmed Adel Attia; Jing Liu; Wei Ai; Dorottya Demszky; Carol Espy-Wilson

doi:10.1609/aies.v7i1.31618

Authors

Ahmed Adel Attia, University of Maryland
Jing Liu, University of Maryland
Wei Ai, University of Maryland
Dorottya Demszky, Stanford University
Carol Espy-Wilson, University of Maryland

DOI:

https://doi.org/10.1609/aies.v7i1.31618

Abstract

Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data. However, this progress doesn’t readily extend to ASR for children due to the limited availability of suitable child-specific databases and the distinct characteristics of children’s speech. A recent study investigated leveraging the My Science Tutor (MyST) children’s speech corpus to enhance Whisper’s performance in recognizing children’s speech. They were able to demonstrate some improvement on a limited test set. This paper builds on these findings by enhancing the utility of the MyST dataset through more efficient data preprocessing. We reduce the Word Error Rate (WER) on the MyST test set from 13.93% to 9.11% with Whisper-Small and from 13.23% to 8.61% with Whisper-Medium and show that this improvement can be generalized to unseen datasets. We also highlight important challenges towards improving children’s ASR performance and the effect of fine-tuning in improving the transcription of disfluent speech.
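For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is commonly computed (word-level edit distance divided by reference length) and how the reported figures translate into relative reductions. It is not the authors' evaluation code: the function names `wer` and `relative_reduction` are hypothetical, and the whitespace tokenization is an assumption that ignores any text normalization the paper may apply before scoring.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumption: plain whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative WER reduction, as a fraction of the starting WER."""
    return (before - after) / before


# WER figures taken from the abstract (MyST test set, in percent).
print(f"Whisper-Small:  {relative_reduction(13.93, 9.11):.1%}")   # 34.6%
print(f"Whisper-Medium: {relative_reduction(13.23, 8.61):.1%}")   # 34.9%
```

Under these assumptions, both models show a relative WER reduction of roughly 35% on the MyST test set.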

Published

2024-10-16

How to Cite

Attia, A. A., Liu, J., Ai, W., Demszky, D., & Espy-Wilson, C. (2024). Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 74-80. https://doi.org/10.1609/aies.v7i1.31618

Issue

Vol. 7 (2024): Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24)

Section

Full Archival Papers