Speech Synthesis Data Collection for Visually Impaired Person


  • Masayuki Ashikawa, Toshiba Corporation
  • Takahiro Kawamura, Toshiba Corporation
  • Akihiko Ohsuga, The University of Electro-Communications


Crowdsourcing platforms provide attractive solutions for collecting speech synthesis data for visually impaired people. However, quality control remains a problem because some volunteer workers produce low-quality results. In this paper, we propose the design of a crowdsourcing system that allows us to devise quality control methods. We introduce four worker selection methods: preprocessing filtering, real-time filtering, post-processing filtering, and guess-processing filtering. These methods include a novel approach that uses a collaborative filtering technique, in addition to a basic approach based on initial training or gold-standard data. The quality control methods improved the quality of the collected speech synthesis data. Moreover, we have already collected 140,000 Japanese words for speech synthesis data from 500 million items of web data.
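
The abstract does not say how gold-standard data is applied, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each worker answer is a (worker, word, reading) tuple and that a small set of words has a known correct reading; the function name `select_workers`, the thresholds, and the sample data are all hypothetical.

```python
from collections import defaultdict

def select_workers(answers, gold, min_accuracy=0.8, min_gold_answers=2):
    """Return the ids of workers whose accuracy on gold items meets the threshold.

    answers: iterable of (worker_id, word, reading) tuples (assumed format).
    gold: dict mapping a word to its known correct reading.
    """
    correct = defaultdict(int)  # gold items answered correctly, per worker
    seen = defaultdict(int)     # gold items answered at all, per worker
    for worker, word, reading in answers:
        if word in gold:
            seen[worker] += 1
            if reading == gold[word]:
                correct[worker] += 1
    # Workers with too few gold answers are not selected, since their
    # accuracy cannot be estimated yet.
    return {
        w for w in seen
        if seen[w] >= min_gold_answers and correct[w] / seen[w] >= min_accuracy
    }

# Hypothetical sample data: readings of Japanese words.
gold = {"東京": "とうきょう", "今日": "きょう"}
answers = [
    ("w1", "東京", "とうきょう"),
    ("w1", "今日", "きょう"),
    ("w1", "明日", "あした"),      # not a gold item, ignored for scoring
    ("w2", "東京", "とうきょう"),
    ("w2", "今日", "こんにち"),    # disagrees with the gold reading
    ("w3", "東京", "とうきょう"),  # only one gold answer so far
]

selected = select_workers(answers, gold)
```

With this sample data only `w1` is selected: `w2` falls below the accuracy threshold and `w3` has answered too few gold items to be judged.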




How to Cite

Ashikawa, M., Kawamura, T., & Ohsuga, A. (2014). Speech Synthesis Data Collection for Visually Impaired Person. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 2(1). Retrieved from https://ojs.aaai.org/index.php/HCOMP/article/view/13206