Semi-Supervised Knowledge Amalgamation for Sequence Classification

Jidapa Thadajarassiri; Thomas Hartvigsen; Xiangnan Kong; Elke A Rundensteiner

doi:10.1609/aaai.v35i11.17185

Authors

Jidapa Thadajarassiri Worcester Polytechnic Institute
Thomas Hartvigsen Worcester Polytechnic Institute
Xiangnan Kong Worcester Polytechnic Institute
Elke A Rundensteiner Worcester Polytechnic Institute

DOI:

https://doi.org/10.1609/aaai.v35i11.17185

Keywords:

Time-Series/Data Streams, Semi-Supervised Learning, Ensemble Methods, Knowledge Acquisition

Abstract

Sequence classification is essential for domains from medical diagnosis to online advertising. In these settings, data are typically proprietary, and annotations are expensive to acquire. Often, so few annotations are available that training a robust model from scratch is impractical. Recently, knowledge amalgamation (KA) has emerged as a promising strategy for training models without such a hard-to-come-by labeled training dataset. To achieve this, KA methods combine the knowledge of multiple pre-trained teacher models (trained on different classification tasks and proprietary datasets) into one student model that becomes an expert on the union of all teachers’ classes. However, we demonstrate that state-of-the-art solutions fail in the presence of overconfident teachers, which make confident but incorrect predictions for instances from classes on which they were not trained. Additionally, to date no work has explored KA for sequence models. Therefore, we propose and then solve the open problem of semi-supervised KA for sequence classification (SKA). Our SKA approach first learns to estimate how trustworthy each teacher is for a given instance, then rescales the predicted probabilities from all teachers to supervise a student model. Our solution overcomes overconfident teachers through careful use of a very small number of labeled instances. We demonstrate that this approach beats eight state-of-the-art alternatives on four real-world datasets by an average of 15% in accuracy with as little as 2% of the training data annotated.
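
The rescaling step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `amalgamate`, the class names, and the trust values are hypothetical, and the trust scores are hand-picked here rather than learned from labeled instances as SKA does. It only shows how per-teacher trust weights could rescale teacher probabilities into one soft target over the union of classes.

```python
def amalgamate(teacher_probs, teacher_classes, trust, all_classes):
    """Combine per-teacher class probabilities into one soft target.

    Assumption (illustrative only): each teacher's probabilities are
    scaled by a per-instance trust weight, summed over the union of
    classes, and renormalized.
    """
    target = {c: 0.0 for c in all_classes}
    for probs, classes, weight in zip(teacher_probs, teacher_classes, trust):
        for cls, p in zip(classes, probs):
            target[cls] += weight * p
    total = sum(target.values())
    return {c: v / total for c, v in target.items()}


# Hypothetical instance whose true class is "cat".
# Teacher A was trained on cat/dog; teacher B on car/bus.
# Teacher B is overconfident on an instance outside its classes.
teacher_classes = [["cat", "dog"], ["car", "bus"]]
teacher_probs = [[0.90, 0.10], [0.95, 0.05]]
all_classes = ["cat", "dog", "car", "bus"]

# Equal trust: the overconfident teacher dominates the target.
uniform = amalgamate(teacher_probs, teacher_classes, [0.5, 0.5], all_classes)

# Hand-picked trust favoring teacher A (a learned estimator would supply this).
weighted = amalgamate(teacher_probs, teacher_classes, [0.9, 0.1], all_classes)

print(max(uniform, key=uniform.get))
print(max(weighted, key=weighted.get))
```

With equal trust the most probable class in the combined target is "car", which reflects the overconfident-teacher failure; with trust shifted toward teacher A it becomes "cat". The resulting distribution would then serve as the supervision signal for the student model.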
