CASTLE: Crowd-Assisted System for Text Labeling and Extraction

Authors

  • Sean Goldberg, University of Florida
  • Daisy Wang, University of Florida
  • Tim Kraska, Brown University

DOI:

https://doi.org/10.1609/hcomp.v1i1.13087

Keywords:

crowdsourcing, information extraction

Abstract

The amount of text data has been growing exponentially and with it the demand for improved information extraction (IE) efforts to analyze and query such data. While automatic IE systems have proven useful in controlled experiments, in practice the gap between machine learning extraction and human extraction is still quite large. In this paper, we propose a system that uses crowdsourcing techniques to help close this gap. One of the fundamental issues inherent in using a large-scale human workforce is deciding the optimal questions to pose to the crowd. We demonstrate novel solutions using mutual information and token clustering techniques in the domain of bibliographic citation extraction. Our experiments show promising results in using crowd assistance as a cost-effective way to close up the "last mile" between extraction systems and a human annotator.

Published

2013-11-03

How to Cite

Goldberg, S., Wang, D., & Kraska, T. (2013). CASTLE: Crowd-Assisted System for Text Labeling and Extraction. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 1(1), 51-59. https://doi.org/10.1609/hcomp.v1i1.13087