Speech Synthesis Data Collection for Visually Impaired Person

Masayuki Ashikawa; Takahiro Kawamura; Akihiko Ohsuga

doi:10.1609/hcomp.v2i1.13206

Authors

Masayuki Ashikawa, Toshiba Corporation
Takahiro Kawamura, Toshiba Corporation
Akihiko Ohsuga, The University of Electro-Communications

Abstract

Crowdsourcing platforms offer an attractive way to collect speech synthesis data for visually impaired people. However, quality control remains a problem because some volunteer workers produce low-quality results. In this paper, we propose the design of a crowdsourcing system that allows us to devise quality control methods. We introduce four worker selection methods: preprocessing filtering, real-time filtering, post-processing filtering, and guess-processing filtering. These methods include a novel approach that uses a collaborative filtering technique, in addition to a basic approach involving initial training or the use of gold-standard data. These quality control methods improved the quality of the collected speech synthesis data. Moreover, we have already collected 140,000 Japanese words from 500 million items of web data for use as speech synthesis data.
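The abstract names gold-standard data as one basis for worker selection but gives no implementation details. As a rough illustration only, the following Python sketch shows one common way such a filter can work: workers are scored on items whose correct answer is already known, and only workers above an accuracy threshold are kept. The function name `select_workers`, the thresholds, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def select_workers(answers, gold, min_accuracy=0.8, min_gold_seen=2):
    """Return worker ids whose accuracy on gold-standard items meets a threshold.

    answers: iterable of (worker_id, item_id, answer) tuples.
    gold: dict mapping item_id to the known correct answer.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, item, answer in answers:
        if item in gold:  # only gold items contribute to the score
            seen[worker] += 1
            if answer == gold[item]:
                correct[worker] += 1
    return {
        worker
        for worker in seen
        if seen[worker] >= min_gold_seen
        and correct[worker] / seen[worker] >= min_accuracy
    }


# Hypothetical sample data: q1 and q2 are gold items, q3 is a regular task.
answers = [
    ("w1", "q1", "a"), ("w1", "q2", "b"), ("w1", "q3", "c"),
    ("w2", "q1", "a"), ("w2", "q2", "x"), ("w2", "q3", "c"),
    ("w3", "q3", "c"),
]
gold = {"q1": "a", "q2": "b"}
trusted = select_workers(answers, gold)
```

In this sketch a worker who has not answered enough gold items is excluded rather than trusted by default; a real system might instead route such workers to additional gold items before deciding.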

Published

2014-10-14

How to Cite

Ashikawa, M., Kawamura, T., & Ohsuga, A. (2014). Speech Synthesis Data Collection for Visually Impaired Person. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 2(1), 3–5. https://doi.org/10.1609/hcomp.v2i1.13206

Issue

Vol. 2 (2014): Second AAAI Conference on Human Computation and Crowdsourcing

Section

Workshop Citizen + X
