Cost-Effective HITs for Relative Similarity Comparisons

Michael Wilber; Iljung Kwak; Serge Belongie

doi:10.1609/hcomp.v2i1.13152

Authors

Michael Wilber Cornell University
Iljung Kwak University of California, San Diego
Serge Belongie Cornell University

Keywords:

Relative comparisons, Embedding, Ordinal embedding, Perceptual similarity, Triplet embedding, User interface

Abstract

Similarity comparisons of the form "Is object a more similar to b than to c?" form a useful foundation for several computer vision and machine learning applications. Unfortunately, an embedding of n points is only uniquely specified by n³ triplets, making it expensive to collect every triplet. Noticing this difficulty, other researchers have investigated more intelligent triplet sampling techniques, but they have not studied the effectiveness or potential drawbacks of those techniques. Although it is important to reduce the number of triplets collected to generate a good embedding, it is also important to understand how best to present a triplet collection task so that it respects the worker's human constraints. In this work, we explore an alternative method for collecting triplets and analyze its financial cost, collection speed, and worker happiness as a function of the final embedding quality. We propose best practices for creating cost-effective human intelligence tasks (HITs) for collecting triplets. We show that, rather than changing the sampling algorithm, simple changes to the crowdsourcing UI can drastically decrease the cost of collecting similarity comparisons. Finally, we provide a food similarity dataset along with the labels collected from crowd workers.
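
To make the cubic growth mentioned in the abstract concrete, the following is a minimal Python sketch, not taken from the paper. It assumes a triplet consists of an anchor object a together with an unordered pair {b, c} of two other distinct objects; under that assumed convention the exact count is n(n-1)(n-2)/2, which grows on the order of n³. The function names `enumerate_triplets` and `triplet_count` are hypothetical and used only for illustration.

```python
from itertools import combinations
from math import comb


def enumerate_triplets(n):
    """Yield every (a, b, c) with anchor a and b < c, all three distinct.

    Assumption: the pair {b, c} is unordered, so (a, b, c) and (a, c, b)
    count as the same question posed to a worker.
    """
    for a in range(n):
        others = [i for i in range(n) if i != a]
        for b, c in combinations(others, 2):
            yield (a, b, c)


def triplet_count(n):
    """Closed-form count: n anchors times C(n - 1, 2) unordered pairs."""
    return n * comb(n - 1, 2)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, triplet_count(n))
```

Even at n = 100 objects, this convention gives 485,100 distinct questions, which illustrates why collecting every triplet quickly becomes impractical.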
