Semi-Supervised Knowledge Amalgamation for Sequence Classification
DOI:
https://doi.org/10.1609/aaai.v35i11.17185
Keywords:
Time-Series/Data Streams, Semi-Supervised Learning, Ensemble Methods, Knowledge Acquisition
Abstract
Sequence classification is essential for domains from medical diagnosis to online advertising. In these settings, data are typically proprietary, and annotations are expensive to acquire. Often, so few annotations are available that training a robust model from scratch is impractical. Recently, knowledge amalgamation (KA) has emerged as a promising strategy for training models without such hard-to-come-by labeled training data. To achieve this, KA methods combine the knowledge of multiple pre-trained teacher models (trained on different classification tasks and proprietary datasets) into one student model that becomes an expert on the union of all teachers’ classes. However, we demonstrate that state-of-the-art solutions fail in the presence of overconfident teachers, which make confident but incorrect predictions for instances from classes on which they were not trained. Additionally, to date, no work has explored KA for sequence models. Therefore, we propose and then solve the open problem of semi-supervised KA for sequence classification (SKA). Our SKA approach first learns to estimate how trustworthy each teacher is for a given instance, then rescales the predicted probabilities from all teachers to supervise a student model. Our solution overcomes overconfident teachers through careful use of a very small number of labeled instances. We demonstrate that this approach beats eight state-of-the-art alternatives on four real-world datasets by an average of 15% in accuracy with as little as 2% of the training data annotated.
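The rescaling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `amalgamate`, its arguments, and the simple weighted sum are assumptions made for illustration. In particular, the per-instance trust scores are passed in directly here, whereas the abstract says they are learned from a small number of labeled instances.

```python
import numpy as np

def amalgamate(teacher_probs, teacher_classes, trust, num_classes):
    """Build one soft label over the union of all teachers' classes.

    teacher_probs:   list of 1-D arrays; each teacher's predicted
                     probabilities over its own classes for one instance.
    teacher_classes: list of index lists mapping each teacher's outputs
                     to positions in the union label space.
    trust:           non-negative trust score per teacher for this
                     instance (hypothetical input; assumed to have a
                     positive sum).
    num_classes:     size of the union label space.
    """
    trust = np.asarray(trust, dtype=float)
    weights = trust / trust.sum()  # normalize trust across teachers

    target = np.zeros(num_classes)
    for w, probs, classes in zip(weights, teacher_probs, teacher_classes):
        # Rescale this teacher's probabilities by its trust weight and
        # place them at the matching positions of the union label space.
        target[classes] += w * np.asarray(probs, dtype=float)

    return target / target.sum()  # soft label used to supervise the student

# Two teachers with disjoint classes; the first is trusted more.
soft_label = amalgamate(
    teacher_probs=[np.array([0.9, 0.1]), np.array([0.6, 0.4])],
    teacher_classes=[[0, 1], [2, 3]],
    trust=[0.8, 0.2],
    num_classes=4,
)
```

With a low trust score, an overconfident teacher contributes little probability mass to the student's target, even when its own prediction is sharply peaked.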
Published
2021-05-18
How to Cite
Thadajarassiri, J., Hartvigsen, T., Kong, X., & Rundensteiner, E. A. (2021). Semi-Supervised Knowledge Amalgamation for Sequence Classification. Proceedings of the AAAI Conference on Artificial Intelligence, 35(11), 9859-9867. https://doi.org/10.1609/aaai.v35i11.17185
Section
AAAI Technical Track on Machine Learning IV