Fair or Fare? Understanding Automated Transcription Error Bias in Social Media and Videoconferencing Platforms

Authors

  • Daniel J Dubois, Northeastern University
  • Nicole Holliday, Pomona College
  • Kaveh Waddell, Consumer Reports; Stanford University
  • David Choffnes, Northeastern University

DOI:

https://doi.org/10.1609/icwsm.v18i1.31320

Abstract

As remote work and learning increase in popularity, individuals, especially those with hearing impairments or who speak English as a second language, may depend on automated transcriptions to participate in business, school, entertainment, or basic communication. In this work, we investigate the automated transcription accuracy of seven popular social media and videoconferencing platforms with respect to some personal characteristics of their users, including gender, age, race, first language, speech rate, F0 frequency, and speech readability. We performed this investigation on a new corpus of 194 hours of English monologues by 846 TED talk speakers. Our results show the presence of significant bias, with transcripts less accurate for male speakers and for non-native English speakers. We also observe differences in accuracy among platforms for different types of speakers. These results indicate that, while platforms have improved their automatic captioning, much work remains to make captions accessible for a wider variety of speakers and listeners.

Published

2024-05-28

How to Cite

Dubois, D. J., Holliday, N., Waddell, K., & Choffnes, D. (2024). Fair or Fare? Understanding Automated Transcription Error Bias in Social Media and Videoconferencing Platforms. Proceedings of the International AAAI Conference on Web and Social Media, 18(1), 367-380. https://doi.org/10.1609/icwsm.v18i1.31320