Online Sequence Alignment for Real-Time Audio Transcription by Non-Experts

Authors

  • Walter Lasecki University of Rochester
  • Christopher Miller University of Rochester
  • Donato Borrello University of Rochester
  • Jeffrey Bigham University of Rochester

DOI:

https://doi.org/10.1609/aaai.v26i1.8420

Keywords:

real-time captioning, real-time transcription, real-time crowdsourcing, assistive technology

Abstract

Real-time transcription provides deaf and hard of hearing people visual access to spoken content such as classroom instruction and other live events. Currently, the only reliable sources of real-time transcription are expensive, highly trained experts who are able to keep up with speaking rates. Automatic speech recognition is cheaper but produces too many errors in realistic settings. We introduce a new approach in which partial captions from multiple non-experts are combined to produce a high-quality transcription in real time. We demonstrate the potential of this approach with data collected from 20 non-expert captionists.

Published

2021-09-20

How to Cite

Lasecki, W., Miller, C., Borrello, D., & Bigham, J. (2021). Online Sequence Alignment for Real-Time Audio Transcription by Non-Experts. Proceedings of the AAAI Conference on Artificial Intelligence, 26(1), 2437-2438. https://doi.org/10.1609/aaai.v26i1.8420