Learning Conflict-Noticed Architecture for Multi-Task Learning

Authors

  • Zhixiong Yue, Southern University of Science and Technology; University of Technology Sydney
  • Yu Zhang, Southern University of Science and Technology; Peng Cheng Laboratory, Shenzhen, China
  • Jie Liang, University of Technology Sydney

DOI:

https://doi.org/10.1609/aaai.v37i9.26312

Keywords:

ML: Transfer, Domain Adaptation, Multi-Task Learning, ML: Auto ML and Hyperparameter Tuning, ML: Deep Neural Architectures, CV: Learning & Optimization for CV

Abstract

Multi-task learning has been widely used in many applications to enable more efficient learning by sharing part of the architecture across multiple tasks. However, a major challenge is gradient conflict when optimizing the shared parameters, where the gradients of different tasks can point in opposite directions. Directly averaging those gradients can impair the performance of some tasks and cause negative transfer. Unlike most existing works, which manipulate gradients to mitigate gradient conflict, this paper addresses the problem from the perspective of architecture learning and proposes a Conflict-Noticed Architecture Learning (CoNAL) method that alleviates gradient conflict by learning architectures. By introducing purely-specific modules for each task into the search space, the CoNAL method automatically learns when to switch to purely-specific modules in the tree-structured network architectures when gradient conflict occurs. To handle multi-task problems with a large number of tasks, we propose a progressive extension of the CoNAL method. Extensive experiments on computer vision, natural language processing, and reinforcement learning benchmarks demonstrate the effectiveness of the proposed methods.
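
The gradient conflict described in the abstract can be illustrated with a small sketch. This is not the CoNAL method and is not taken from the paper; it only shows, under stated assumptions, how two task gradients on shared parameters can conflict and how plain averaging can then hurt one task. The function names, the toy gradient values, and the use of negative cosine similarity as the conflict test are all assumptions made for illustration.

```python
import math


def dot(g1, g2):
    """Inner product of two gradient vectors given as lists of floats."""
    return sum(a * b for a, b in zip(g1, g2))


def cosine_similarity(g1, g2):
    """Cosine of the angle between two gradients; 0.0 if either is zero."""
    n1 = math.sqrt(dot(g1, g1))
    n2 = math.sqrt(dot(g2, g2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot(g1, g2) / (n1 * n2)


def gradients_conflict(g1, g2):
    """Assumed criterion: gradients conflict when their cosine is negative."""
    return cosine_similarity(g1, g2) < 0.0


def average_gradients(grads):
    """Element-wise mean of several task gradients."""
    n = len(grads)
    return [sum(vals) / n for vals in zip(*grads)]


# Hypothetical gradients of two task losses w.r.t. the same shared parameters.
grad_task_a = [1.0, 0.2]
grad_task_b = [-3.0, 0.1]

avg_grad = average_gradients([grad_task_a, grad_task_b])

# To first order, a descent step of size lr along avg_grad changes task A's
# loss by -lr * <grad_task_a, avg_grad>. A negative inner product therefore
# means the averaged update increases task A's loss.
alignment_with_a = dot(grad_task_a, avg_grad)

print("conflict:", gradients_conflict(grad_task_a, grad_task_b))
print("averaged update hurts task A:", alignment_with_a < 0.0)
```

In this toy case the larger gradient of task B dominates the average, so the shared update moves against task A. That is the kind of situation in which, according to the abstract, CoNAL learns to switch to task-specific modules instead of continuing to share parameters.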

Published

2023-06-26

How to Cite

Yue, Z., Zhang, Y., & Liang, J. (2023). Learning Conflict-Noticed Architecture for Multi-Task Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 11078-11086. https://doi.org/10.1609/aaai.v37i9.26312

Issue

Vol. 37 No. 9 (2023)

Section

AAAI Technical Track on Machine Learning IV