The Crowd-Median Algorithm

Authors

  • Hannes Heikinheimo, Rovio Entertainment Ltd
  • Antti Ukkonen, Aalto University

DOI:

https://doi.org/10.1609/hcomp.v1i1.13079

Keywords:

human computation, crowdsourcing, algorithms, median, clustering, kmeans

Abstract

The power of human computation is founded on the capabilities of humans to process qualitative information in a manner that is hard to reproduce with a computer. However, all machine learning algorithms rely on mathematical operations, such as sums, averages, and least squares, that are less suitable for human computation. This paper is an effort to combine these two aspects of data processing. We consider the problem of computing a centroid of a data set, a key component in many data-analysis applications such as clustering, using a very simple human intelligence task (HIT). In this task the workers must choose the outlier from a set of three items. After presenting a number of such triplets to the workers, the item chosen the least number of times as the outlier is selected as the centroid. We provide a proof that the centroid determined by this procedure is equal to the mean of a univariate normal distribution. Furthermore, as a demonstration of the viability of our method, we implement a human-computation-based variant of the k-means clustering algorithm. We present experiments where the proposed method is used to find an "average" image in a collection, and to cluster images into semantic categories.
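
The triplet procedure described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the human worker is replaced by a hypothetical function, simulated_worker, that picks the item with the largest summed distance to the other two, the items are plain numbers, and every possible triplet is enumerated rather than a sample of triplets being shown to workers. The names count_outlier_votes and crowd_median are likewise illustrative.

```python
from itertools import combinations


def simulated_worker(values, triplet):
    """Hypothetical stand-in for a human worker (an assumption of this sketch):
    pick the index whose item has the largest summed distance to the other two."""
    def total_distance(i):
        return sum(abs(values[i] - values[j]) for j in triplet if j != i)
    return max(triplet, key=total_distance)


def count_outlier_votes(values, triplets, worker=simulated_worker):
    """Return, for each item index, how often it was chosen as the outlier."""
    votes = [0] * len(values)
    for triplet in triplets:
        votes[worker(values, triplet)] += 1
    return votes


def crowd_median(values, triplets, worker=simulated_worker):
    """Return the index of the item chosen least often as the outlier.
    Ties resolve to the lowest index."""
    votes = count_outlier_votes(values, triplets, worker)
    return min(range(len(values)), key=votes.__getitem__)


# Toy data: five numbers, with all 10 triplets enumerated (an assumption;
# the abstract only says that "a number of such triplets" are presented).
values = [1.0, 4.0, 5.0, 5.5, 9.5]
all_triplets = list(combinations(range(len(values)), 3))
votes = count_outlier_votes(values, all_triplets)
center = crowd_median(values, all_triplets)
print(votes, values[center])
```

On this toy data the two extreme items collect most of the outlier votes, and the items near the middle collect none, so one of the central items is returned. With real workers, the worker argument would be replaced by whatever collects a human answer for a triplet of items such as images.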

Published

2013-11-03

How to Cite

Heikinheimo, H., & Ukkonen, A. (2013). The Crowd-Median Algorithm. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 1(1), 69-77. https://doi.org/10.1609/hcomp.v1i1.13079