Multi-Task Test-time Adaptation via Gradient Consensus and Plasticity Constraint

Zhong Ye; Yu Hu; Zhenguo Yang

doi:10.1609/aaai.v40i33.40006

Authors

Zhong Ye Guangdong University of Technology
Yu Hu Guangdong University of Technology
Zhenguo Yang Guangdong University of Technology

DOI:

https://doi.org/10.1609/aaai.v40i33.40006

Abstract

Multi-task test-time adaptation (MT-TTA) aims to adapt pre-trained models to dynamic environments during multi-task inference by leveraging unlabeled test data. The task is particularly challenging because different tasks respond divergently to distribution shifts, and mixed input streams containing both in-distribution (ID) and out-of-distribution (OOD) samples make adapted models prone to catastrophic forgetting of ID knowledge. Existing methods such as M-TENT extend classic test entropy minimization (TENT) by minimizing multi-task entropies and adapting the model with a task-averaged gradient, but they suffer from two key limitations: 1) the average-gradient strategy of M-TENT may exacerbate multi-task test-time optimization conflicts, harming individual tasks when their gradients are directionally non-consensual; 2) aggressive updates on mixed ID/OOD data cause severe forgetting of ID knowledge. In this paper, we theoretically establish a formal connection between multi-task loss differences and test-time performance via a first-order Taylor analysis, demonstrating that consensual multi-task entropy reductions are likely to improve performance, while non-consensual ones may degrade it. To this end, we propose Consensus-driven Constrained Multi-Task Test-Time Adaptation (CoCo-MT-TTA), consisting of 1) multi-task gradient consensus adaptation, which aligns cross-task gradient directions to seek a consensus gradient; 2) multi-task plasticity-constraint adaptation, which constrains parameter updates using second-moment statistics to preserve ID knowledge. Extensive experiments on benchmark datasets, including CelebA and PlantData, demonstrate that our method achieves an absolute improvement of up to 16.02% in mean ID/OOD F1-score (Mean I&O) under domain shifts over non-adapted models, outperforming recent baselines.
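
The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the consensus gradient or the plasticity constraint is computed, so the sign-agreement rule, the damping formula, and the names `consensus_gradient`, `plasticity_constrained_step`, and `lam` are all assumptions chosen for illustration. The sketch relies only on the standard first-order fact that a step `-lr * g` changes task i's loss by approximately `-lr * dot(g_i, g)`, so a task is harmed when its gradient has a negative inner product with the update direction.

```python
import numpy as np

def consensus_gradient(grads):
    """Illustrative consensus rule (an assumption, not the paper's method).

    grads: array of shape (num_tasks, num_params), one gradient per task.
    Keeps the task-averaged gradient only on coordinates where every task
    gradient has the same nonzero sign, and zeroes the rest.
    """
    signs = np.sign(grads)
    agree = np.all(signs == signs[0], axis=0) & (signs[0] != 0)
    return np.where(agree, grads.mean(axis=0), 0.0)

def plasticity_constrained_step(theta, g, second_moment, lr=1e-3, lam=1.0):
    """Illustrative plasticity constraint (an assumption, not the paper's method).

    second_moment: per-parameter estimate of squared gradients, assumed here
    to have been collected beforehand (for example on ID data).
    Parameters with a large second moment are moved less.
    """
    scale = 1.0 / (1.0 + lam * second_moment)
    return theta - lr * scale * g

# Two tasks whose gradients conflict on the second parameter.
task_grads = np.array([[0.2, -4.0],
                       [0.2,  1.0]])

avg = task_grads.mean(axis=0)          # task-averaged gradient
con = consensus_gradient(task_grads)   # sign-agreement consensus gradient

# First-order loss change per task is proportional to -dot(g_i, update).
avg_alignment = task_grads @ avg       # negative entry: that task's loss rises
con_alignment = task_grads @ con       # non-negative for every task

theta = np.zeros(2)
v = np.array([9.0, 0.0])               # hypothetical second-moment statistics
theta_new = plasticity_constrained_step(theta, con, v, lr=1.0, lam=1.0)
```

In this toy case the averaged gradient has a negative inner product with the second task's gradient, so to first order that task's loss would increase, whereas the sign-agreement direction has a non-negative inner product with every task gradient by construction. The damping step then shrinks the update on the parameter with the larger second-moment value.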
