Active Learning for Crowdsourcing Using Knowledge Transfer


  • Meng Fang University of Technology, Sydney
  • Jie Yin CSIRO
  • Dacheng Tao University of Technology, Sydney



Keywords: Active Learning, Crowdsourcing, Knowledge Transfer


This paper studies the active learning problem in crowdsourcing settings, where multiple imperfect annotators with varying levels of expertise are available for labeling the data in a given task. Annotations collected from these labelers may be noisy and unreliable, and the quality of labeled data needs to be maintained for data mining tasks. Previous solutions have attempted to estimate individual users' reliability based on existing knowledge in each task, but for this to be effective each task requires a large quantity of labeled data to provide accurate estimates. In practice, annotation budgets for a given task are limited, so each instance can be presented to only a few users, each of whom can label only a few examples. To overcome this data scarcity, we propose a new probabilistic model that transfers knowledge from abundant unlabeled data in auxiliary domains to help estimate labelers' expertise. Based on this model, we present a novel active learning algorithm that simultaneously a) selects the most informative example and b) queries its label from the labeler with the best expertise. Experiments on both text and image datasets demonstrate that our proposed method outperforms other state-of-the-art active learning methods.
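The selection step described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual model: the instance posteriors and the per-labeler expertise scores are hypothetical placeholders standing in for quantities that, in the paper, would come from the learned classifier and the knowledge-transfer model. It uses entropy as one common measure of informativeness.

```python
import numpy as np

# Hypothetical posterior P(y=1 | x) for each unlabeled instance,
# as produced by some current classifier (placeholder values).
posteriors = np.array([0.9, 0.55, 0.2, 0.48, 0.7])

# Hypothetical per-labeler expertise estimates (probability of a
# correct label); in the paper these would be inferred with help
# from auxiliary-domain data.
expertise = np.array([0.6, 0.85, 0.7])

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli(p) label distribution."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

# a) pick the most informative (highest-entropy) instance ...
instance = int(np.argmax(binary_entropy(posteriors)))
# b) ... and query its label from the labeler with the best
#    estimated expertise.
labeler = int(np.argmax(expertise))
```

Here the instance whose posterior is closest to 0.5 is selected, and the query is routed to the labeler with the highest expertise estimate; the paper's contribution is in how that expertise is estimated when labeled data is scarce.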




How to Cite

Fang, M., Yin, J., & Tao, D. (2014). Active Learning for Crowdsourcing Using Knowledge Transfer. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1).



Main Track: Novel Machine Learning Algorithms