Amalgamating Multi-Task Models with Heterogeneous Architectures
DOI:
https://doi.org/10.1609/aaai.v38i14.29459
Keywords:
ML: Transfer, Domain Adaptation, Multi-Task Learning, KRR: Knowledge Acquisition, ML: Ensemble Methods
Abstract
Multi-task learning (MTL) is essential for real-world applications that handle multiple tasks simultaneously, such as self-driving cars. MTL methods improve the performance of all tasks by utilizing information across tasks to learn a robust shared representation. However, acquiring sufficient labeled data tends to be extremely expensive, especially when many tasks must be supported. Recently, Knowledge Amalgamation (KA) has emerged as an effective strategy for addressing the lack of labels by instead learning directly from pretrained models (teachers). KA learns one unified multi-task student that masters all tasks across all teachers. Existing KA methods for MTL are limited to teachers with identical architectures, and thus propose layer-to-layer based approaches. Unfortunately, in practice, teachers may have heterogeneous architectures; their layers may not be aligned, and their dimensionalities or scales may be incompatible. Amalgamating multi-task teachers with heterogeneous architectures remains an open problem. To address this, we design the Versatile Common Feature Consolidator (VENUS), the first solution to this problem. VENUS fuses knowledge from the shared representations of each teacher into one unified, generalized representation for all tasks. Specifically, we design the Feature Consolidator network, which leverages an array of teacher-specific trainable adaptors. These adaptors enable the student to learn from multiple teachers even if they have incompatible learned representations. We demonstrate that VENUS outperforms five alternative methods on numerous benchmark datasets across a broad spectrum of experiments.
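To make the adaptor idea concrete, here is a minimal sketch of how teacher-specific adaptors could project heterogeneous teacher features into one common space and fuse them. The dimensions, the linear form of the adaptors, and the mean-based fusion are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical teacher feature sizes (heterogeneous architectures).
teacher_dims = [64, 128]   # assumed, for illustration only
common_dim = 32            # assumed size of the unified representation

# One trainable adaptor per teacher (here a plain linear map) projects
# that teacher's features into the shared common-feature space.
adaptors = [rng.standard_normal((d, common_dim)) * 0.01 for d in teacher_dims]

def consolidate(teacher_feats):
    """Project each teacher's features through its adaptor and fuse
    them (here: a simple mean) into one unified representation."""
    projected = [f @ W for f, W in zip(teacher_feats, adaptors)]
    return np.mean(projected, axis=0)

# A batch of 4 examples; each teacher emits its own representation.
feats = [rng.standard_normal((4, d)) for d in teacher_dims]
unified = consolidate(feats)
print(unified.shape)  # (4, 32)
```

In a full system the adaptor weights would be trained jointly with the student so that incompatible teacher representations become comparable before fusion.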
Published
2024-03-24
How to Cite
Thadajarassiri, J., Gerych, W., Kong, X., & Rundensteiner, E. (2024). Amalgamating Multi-Task Models with Heterogeneous Architectures. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15346-15354. https://doi.org/10.1609/aaai.v38i14.29459
Section
AAAI Technical Track on Machine Learning V