Beam Search Optimized Batch Bayesian Active Learning


  • Jingyu Sun NTT Computer and Data Science Laboratories
  • Hongjie Zhai NTT Software Innovation Center
  • Osamu Saisho NTT Social Informatics Laboratories
  • Susumu Takeuchi NTT Computer and Data Science Laboratories



HAI: Human-in-the-Loop Machine Learning, HAI: Applications, ML: Active Learning, ML: Applications, ML: Deep Neural Architectures, ML: Evaluation and Analysis (Machine Learning)


Active learning is an essential method for label-efficient deep learning. As a Bayesian active learning method, Bayesian Active Learning by Disagreement (BALD) successfully selects the most representative samples by maximizing the mutual information between the model predictions and the model parameters. However, when applied in a batch acquisition mode, such as batch construction with greedy search, BALD suffers from poor performance, especially under noise from near-duplicate data. To address this shortcoming, we propose a diverse beam search optimized batch active learning method, which explores a graph for every batch construction by expanding a predetermined number of the highest-scored samples. To avoid near-duplicate beam branches (very similar beams generated from the same root and from similar samples), which are undesirable because they lack diverse representations in the feature space, we design a self-adapted constraint among candidate beams. The proposed method acquires data that better represent the distribution of the unlabeled pool while remaining significantly different from existing beams. We observe that the proposed method achieves higher batch performance than the baseline methods on three benchmark datasets.
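The two ingredients described in the abstract — a BALD acquisition score and a beam search over candidate batches with a diversity constraint — can be sketched as follows. This is a minimal illustrative sketch, not the paper's algorithm: it assumes Monte Carlo dropout predictive probabilities of shape `(T, N, C)`, uses the standard BALD mutual-information score, and substitutes a fixed minimum feature distance (`min_dist`) for the paper's self-adapted constraint; all function names and parameters are hypothetical.

```python
import numpy as np

def bald_scores(probs):
    """BALD mutual information per sample.

    probs: array of shape (T, N, C) holding T stochastic (e.g. MC dropout)
    predictive distributions over C classes for N unlabeled samples.
    Returns H[mean prediction] - mean[H[prediction]] for each sample.
    """
    mean = probs.mean(axis=0)                                    # (N, C)
    h_mean = -(mean * np.log(mean + 1e-12)).sum(-1)              # entropy of mean
    h_each = -(probs * np.log(probs + 1e-12)).sum(-1).mean(0)    # mean entropy
    return h_mean - h_each

def beam_batch(probs, feats, batch_size, beam_width=3, min_dist=0.1):
    """Grow candidate batches (beams) one sample at a time.

    Each beam is expanded with its top-scoring unused samples, skipping
    candidates whose feature distance to every current member is below
    min_dist -- a simple stand-in for a diversity constraint.
    Returns the member indices of the highest-scoring beam.
    """
    scores = bald_scores(probs)
    order = np.argsort(-scores)        # samples sorted by BALD score, descending
    beams = [([], 0.0)]                # (member indices, accumulated score)
    for _ in range(batch_size):
        candidates = []
        for members, total in beams:
            added = 0
            for i in order:
                if int(i) in members:
                    continue
                # enforce diversity: reject samples too close to the beam
                if members and min(np.linalg.norm(feats[i] - feats[j])
                                   for j in members) < min_dist:
                    continue
                candidates.append((members + [int(i)], total + scores[i]))
                added += 1
                if added == beam_width:
                    break
        # keep the best beam_width beams, deduplicating identical member sets
        seen, beams = set(), []
        for members, total in sorted(candidates, key=lambda c: -c[1]):
            key = frozenset(members)
            if key not in seen:
                seen.add(key)
                beams.append((members, total))
            if len(beams) == beam_width:
                break
    return beams[0][0]
```

Because the beams are deduplicated by member set and pruned by a distance threshold, the surviving batches trade off acquisition score against spread in feature space, which is the qualitative behavior the abstract describes.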




How to Cite

Sun, J., Zhai, H., Saisho, O., & Takeuchi, S. (2023). Beam Search Optimized Batch Bayesian Active Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(5), 6084-6091.



AAAI Technical Track on Humans and AI