Bounding Uncertainty for Active Batch Selection

Hanmo Wang; Runwu Zhou; Yi-Dong Shen

doi:10.1609/aaai.v33i01.33015240

Authors

Hanmo Wang, Chinese Academy of Sciences
Runwu Zhou, Chinese Academy of Sciences
Yi-Dong Shen, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v33i01.33015240

Abstract

The success of batch mode active learning (BMAL) methods lies in selecting samples that are both representative and uncertain. Representative samples quickly capture the global structure of the whole dataset, while uncertain ones refine the decision boundary. There are two principles for trading off representativeness against uncertainty: the direct approach and the screening approach. Although both are widely used in the literature, little is known about the relationship between them. In this paper, we discover that both approaches have shortcomings in the initial stage of BMAL. To alleviate these shortcomings, we bound the certainty scores of unlabeled samples from below and directly combine this lower-bounded certainty with representativeness in the objective function. Additionally, we show that the two aforementioned approaches are mathematically equivalent to two special cases of our approach. To the best of our knowledge, this is the first work that tries to generalize the direct and screening approaches. The objective function is then solved by super-modularity optimization. Extensive experiments on fifteen datasets indicate that our method achieves significantly higher classification accuracy on test data than the latest state-of-the-art BMAL methods, and that it also scales better, even when the size of the unlabeled pool reaches 10⁶.
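The idea of bounding certainty from below before combining it with representativeness can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `clip_certainty`, `coverage`, and `select_batch`, the `bound` and `trade_off` parameters, the facility-location style coverage term used as a stand-in for representativeness, and the plain greedy loop used in place of the super-modularity optimization mentioned in the abstract.

```python
def clip_certainty(certainty, bound):
    # Lower-bound each certainty score: values below `bound` are raised to it.
    return [max(c, bound) for c in certainty]


def coverage(similarity, batch):
    # Stand-in representativeness term (an assumption, not the paper's):
    # each pool sample is credited with its similarity to the closest
    # selected sample.
    if not batch:
        return 0.0
    return sum(max(row[j] for j in batch) for row in similarity)


def select_batch(similarity, certainty, batch_size, bound, trade_off=1.0):
    # Greedy stand-in solver: repeatedly add the sample with the best
    # coverage gain minus the weighted, lower-bounded certainty.
    clipped = clip_certainty(certainty, bound)
    batch = []
    for _ in range(batch_size):
        base = coverage(similarity, batch)
        best, best_score = None, float("-inf")
        for j in range(len(certainty)):
            if j in batch:
                continue
            gain = coverage(similarity, batch + [j]) - base
            score = gain - trade_off * clipped[j]
            if score > best_score:
                best, best_score = j, score
        batch.append(best)
    return batch


# Toy pool: samples 0-2 are similar to each other, sample 3 is isolated
# but has a very low certainty score.
similarity = [
    [1.0, 0.9, 0.5, 0.1],
    [0.9, 1.0, 0.5, 0.1],
    [0.5, 0.5, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
certainty = [0.9, 0.8, 0.7, 0.05]

print(select_batch(similarity, certainty, 1, bound=0.0, trade_off=2.0))
print(select_batch(similarity, certainty, 1, bound=0.75, trade_off=2.0))
```

In this toy setting, a bound at or below the smallest certainty score leaves the scores unchanged, so the isolated sample 3 wins on its low certainty alone. Raising the bound to 0.75 caps how much any single low certainty score can contribute, and the selection shifts to sample 1, which covers the pool better. How the bound is chosen in the actual method, and how the direct and screening approaches arise as special cases, is not specified in the abstract.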
