Anytime Active Learning

Authors

  • Maria Ramirez-Loaiza, Illinois Institute of Technology
  • Aron Culotta, Illinois Institute of Technology
  • Mustafa Bilgic, Illinois Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v28i1.9015

Keywords:

active learning, non-uniform labeling cost, document classification

Abstract

A common bottleneck in deploying supervised learning systems is collecting human-annotated examples. In many domains, annotators form an opinion about the label of an example incrementally -- e.g., each additional word read from a document or each additional minute spent inspecting a video helps inform the annotation. In this paper, we investigate whether we can train learning systems more efficiently by requesting an annotation before inspection is fully complete -- e.g., after reading only 25 words of a document. While doing so may reduce the overall annotation time, it also introduces the risk that the annotator might not be able to provide a label if interrupted too early. We propose an anytime active learning approach that optimizes the annotation time and response rate simultaneously. We conduct user studies on two document classification datasets and develop simulated annotators that mimic the users. Our simulated experiments show that anytime active learning outperforms several baselines on these two datasets. For example, with an annotation budget of one hour, training a classifier by annotating the first 25 words of each document reduces classification error by 17% over annotating the first 100 words of each document.
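To make the time/response-rate trade-off at the heart of the abstract concrete, the sketch below simulates how many labels a fixed one-hour budget yields at different truncation lengths k. It is a minimal illustration only: the reading speed, the response-rate curve, and all constants are assumptions for exposition, not values from the paper (which estimates response behavior from user studies).

import random

# Illustrative assumptions (not from the paper):
# - reading time grows linearly with the number of words shown
# - the chance the annotator can commit to a label grows with k
SECONDS_PER_WORD = 1.0    # assumed reading speed
BUDGET_SECONDS = 3600.0   # one-hour annotation budget, as in the abstract

def response_rate(k):
    """Assumed probability the annotator answers after seeing k words."""
    return min(1.0, 0.4 + 0.006 * k)

def simulate(k, budget=BUDGET_SECONDS, seed=0):
    """Count labels collected under the budget when showing k words per document."""
    rng = random.Random(seed)
    spent, labels = 0.0, 0
    while spent + k * SECONDS_PER_WORD <= budget:
        spent += k * SECONDS_PER_WORD   # time is spent even when no label comes back
        if rng.random() < response_rate(k):
            labels += 1
    return labels

for k in (10, 25, 50, 100):
    print(f"first {k:>3} words -> {simulate(k)} labels per hour")

Under these assumed curves, shorter snippets produce more (but less reliable) label attempts per hour, which is the tension the proposed anytime active learner optimizes over.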

Published

2014-06-21

How to Cite

Ramirez-Loaiza, M., Culotta, A., & Bilgic, M. (2014). Anytime Active Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1). https://doi.org/10.1609/aaai.v28i1.9015

Section

Main Track: Novel Machine Learning Algorithms