Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss

Authors

  • Seungmin Seo, Yonsei University
  • Donghyun Kim, Yonsei University
  • Youbin Ahn, Yonsei University
  • Kyong-Ho Lee, Yonsei University

DOI:

https://doi.org/10.1609/aaai.v36i10.21378

Keywords:

Speech & Natural Language Processing (SNLP)

Abstract

Active learning attempts to maximize a task model’s performance gain by obtaining a set of informative samples from an unlabeled data pool. Previous active learning methods usually rely on specific network architectures or task-dependent sample acquisition algorithms. Moreover, when selecting a batch of samples, previous works suffer from insufficient diversity because they consider only the informativeness of each individual sample. This paper proposes a task-independent batch acquisition method that uses triplet loss to distinguish hard samples in an unlabeled data pool, i.e., samples with similar features but hard-to-identify labels. To assess the effectiveness of the proposed method, we compare it with state-of-the-art active learning methods on two tasks, relation extraction and sentence classification. Experimental results show that our method outperforms the baselines on the benchmark datasets.
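The abstract's core ingredient is the triplet loss. As a minimal sketch (the standard margin-based formulation, not necessarily the authors' exact variant), it penalizes an embedding whenever a "positive" sample lies farther from its anchor than a "negative" sample, minus a margin:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet loss: pull the positive toward the anchor and
    # push the negative at least `margin` farther away than the positive.
    d_pos = np.linalg.norm(anchor - positive)   # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)   # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings: the positive is much closer to the anchor than
# the negative, so the margin is satisfied and the loss is zero.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n = np.array([2.0, 0.0])
print(triplet_loss(a, p, n))  # max(0, 0.1 - 2.0 + 1.0) = 0.0
```

In an active-learning setting, embeddings whose triplet loss remains high are exactly the "hard" samples the abstract describes: their features sit close to samples of a different label, so querying their labels is informative.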

Published

2022-06-28

How to Cite

Seo, S., Kim, D., Ahn, Y., & Lee, K.-H. (2022). Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss. Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), 11276-11284. https://doi.org/10.1609/aaai.v36i10.21378

Section

AAAI Technical Track on Speech and Natural Language Processing