An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding

Authors

  • Dou Hu — Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
  • Lingwei Wei — Institute of Information Engineering, Chinese Academy of Sciences
  • Wei Zhou — Institute of Information Engineering, Chinese Academy of Sciences
  • Songlin Hu — Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v39i16.33899

Abstract

This paper proposes a new principled multi-task representation learning framework (InfoMTL) to extract noise-invariant sufficient representations for all tasks. It ensures the sufficiency of shared representations for all tasks and mitigates the negative effect of redundant features, which can enhance the language understanding of pre-trained language models (PLMs) under the multi-task paradigm. First, a shared information maximization principle is proposed to learn more sufficient shared representations for all target tasks. It can avoid the insufficiency issue arising from representation compression in the multi-task paradigm. Second, a task-specific information minimization principle is designed to mitigate the negative effect of potentially redundant features in the input for each task. It compresses task-irrelevant redundant information while preserving the information necessary for multi-task prediction. Experiments on six classification benchmarks show that our method outperforms 12 comparative multi-task methods under the same multi-task settings, especially in data-constrained and noisy scenarios. Extensive experiments demonstrate that the learned representations are more sufficient, data-efficient, and robust.
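At a high level, the two principles resemble a multi-task variant of the information-bottleneck trade-off. The following sketch is illustrative only; the notation (shared representation Z, task-specific representations Z_t, labels Y_t, trade-off weight β) is assumed here and is not taken from the paper itself:

```latex
% Hedged sketch of the two principles, assuming a shared encoder p(z|x)
% and task-specific heads p(z_t|z) for tasks t = 1, ..., T.

% Shared information maximization: keep the shared representation Z
% sufficient for all tasks by maximizing its information about the input X.
\max_{p(z \mid x)} \; I(X; Z)

% Task-specific information minimization: for each task t, compress the
% task representation Z_t (discarding redundant features inherited from Z)
% while preserving the information relevant to the task label Y_t.
\max_{p(z_t \mid z)} \; \sum_{t=1}^{T} \Big[\, I(Z_t; Y_t) \;-\; \beta \, I(Z; Z_t) \,\Big]
```

The first objective counteracts over-compression of the shared encoder (the "insufficiency issue"), while the second is an information-bottleneck-style term applied per task head.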

Published

2025-04-11

How to Cite

Hu, D., Wei, L., Zhou, W., & Hu, S. (2025). An Information-theoretic Multi-task Representation Learning Framework for Natural Language Understanding. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16), 17276–17286. https://doi.org/10.1609/aaai.v39i16.33899

Section

AAAI Technical Track on Machine Learning II