TY  - JOUR
AU  - Kimura, Masanari
AU  - Wakabayashi, Kei
AU  - Morishima, Atsuyuki
PY  - 2020/10/01
Y2  - 2024/03/28
TI  - Batch Prioritization of Data Labeling Tasks for Training Classifiers
JF  - Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
JA  - HCOMP
VL  - 8
IS  - 1
SE  - Short Papers
DO  - 10.1609/hcomp.v8i1.7476
UR  - https://ojs.aaai.org/index.php/HCOMP/article/view/7476
SP  - 163-167
AB  - In a data labeling process for building machine learning models, the choice of which data instances to label is known to have a significant impact on the performance of classifiers. So far, the study of active learning has addressed the issue of how to choose the subset by prioritizing the data instances based on the state of the current classifier. However, the active learning approach has two drawbacks: (i) it requires a training loop to update the priorities of labeling tasks, and (ii) it requires us to choose a specific active learner even though we do not know the optimal classification model. In this paper, we propose a new framework for a priority-aware labeling system that allows parallel task assignment to crowd workers without assuming a particular classifier, based on novel methods called “batch prioritization” and “label expansion”. We conducted experiments with multiple datasets to examine the effectiveness of the approach and found that the proposed method improves the performance of the final classifiers more quickly than the active learning approach, even though the labeling tasks can be processed in a fully parallel manner.
ER  -