Semi-Supervised Learning From Crowds Using Deep Generative Models

Kyohei Atarashi; Satoshi Oyama; Masahito Kurihara

doi:10.1609/aaai.v32i1.11513

Authors

Kyohei Atarashi, Hokkaido University
Satoshi Oyama, Hokkaido University; RIKEN AIP
Masahito Kurihara, Hokkaido University

Keywords:

Crowdsourcing, Semi-supervised Learning, Deep Learning

Abstract

Although supervised learning requires a labeled dataset, obtaining labels from experts is generally expensive. For this reason, crowdsourcing services are attracting attention in the field of machine learning as a way to collect labels at relatively low cost. However, labels obtained by crowdsourcing, i.e., from non-expert workers, are often noisy. A number of methods have thus been devised for inferring true labels, and several methods have been proposed for learning classifiers directly from crowdsourced labels, referred to as "learning from crowds." A more practical problem is learning from both crowdsourced labeled data and unlabeled data, i.e., "semi-supervised learning from crowds." This paper presents a novel generative model of the labeling process in crowdsourcing. It leverages unlabeled data effectively by introducing latent features and a data distribution. Because the data distribution can be complicated, we model it with a deep neural network, so our model can be regarded as a kind of deep generative model. The problems caused by the intractability of the latent variable posteriors are solved by introducing an inference model. The experiments show that the proposed model outperforms four existing models, including a baseline model, on the MNIST dataset with simulated workers and on the Rotten Tomatoes movie review dataset with Amazon Mechanical Turk workers.
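
To make the problem setting concrete, the following minimal sketch simulates noisy labels from several workers and recovers an estimate of the true labels by majority voting. It is not the model proposed in the paper: the single per-worker accuracy, the uniform choice among wrong classes, and the names `simulate_worker_labels`, `majority_vote`, and `accuracy` are all assumptions made for illustration only.

```python
import random
from collections import Counter


def simulate_worker_labels(true_labels, worker_accuracies, num_classes, seed=0):
    """Return crowd[i][j], the label worker j assigns to item i.

    Assumed noise model (hypothetical, not from the paper): worker j gives
    the true label with probability worker_accuracies[j], and otherwise
    picks one of the other classes uniformly at random.
    """
    rng = random.Random(seed)
    crowd = []
    for y in true_labels:
        row = []
        for acc in worker_accuracies:
            if rng.random() < acc:
                row.append(y)
            else:
                wrong = [c for c in range(num_classes) if c != y]
                row.append(rng.choice(wrong))
        crowd.append(row)
    return crowd


def majority_vote(crowd):
    """Estimate each item's label as its most frequent worker label.

    Ties are broken in favor of the label that appears first in the row.
    """
    return [Counter(row).most_common(1)[0][0] for row in crowd]


def accuracy(predicted, truth):
    """Fraction of items whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Ten classes, as in a digit-classification task, with five simulated workers.
label_rng = random.Random(1)
true_labels = [label_rng.randrange(10) for _ in range(1000)]
worker_accuracies = [0.9, 0.7, 0.6, 0.6, 0.5]  # illustrative values

crowd = simulate_worker_labels(true_labels, worker_accuracies, num_classes=10)
estimated = majority_vote(crowd)
print("majority-vote accuracy:", accuracy(estimated, true_labels))
```

Majority voting uses neither the item features nor any unlabeled items, which are the two sources of information the semi-supervised setting described above is meant to exploit.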
