Sparse Tuning Enhances Plasticity in PTM-based Continual Learning

Huan Zhang; Shenghua Fan; Shuyu Dong; Yujin Zheng; Dingwen Wang; Fan Lyu

doi:10.1609/aaai.v40i33.40050

Authors

Huan Zhang Wuhan University
Shenghua Fan Wuhan University
Shuyu Dong Central China Normal University
Yujin Zheng Wuhan University
Dingwen Wang Wuhan University
Fan Lyu Institute of Automation, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v40i33.40050

Abstract

Continual learning with pre-trained models (PTMs) holds great promise for efficient adaptation across sequential tasks. However, most existing approaches freeze the PTM and rely on auxiliary modules such as prompts or adapters, which limits model plasticity and leads to suboptimal generalization under significant distribution shifts. While full fine-tuning can improve adaptability, it risks disrupting crucial pre-trained knowledge. In this paper, we propose Mutual Information-guided Sparse Tuning (MIST), a plug-and-play method that selectively updates a small subset of PTM parameters (less than 5%), chosen by their sensitivity to mutual information objectives. MIST enables effective task-specific adaptation while preserving generalization. To further reduce interference, we introduce strong sparsity regularization by randomly dropping gradients during tuning, so that fewer than 0.5% of parameters are updated per step. Applied before standard freeze-based methods, MIST consistently boosts performance across diverse continual learning benchmarks. Experiments show that integrating our method into multiple baselines yields significant performance gains.
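The two sparsity mechanisms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how sensitivity to the mutual information objectives is computed, so the `sensitivity` scores here are random placeholders, and the function names `select_sparse_mask` and `sparse_sgd_step`, the 4% selection fraction, the 10% keep probability, and the plain SGD update are all assumptions chosen only to match the stated budgets (under 5% selected, under 0.5% updated per step in expectation).

```python
import random


def select_sparse_mask(sensitivity, fraction=0.04):
    """Return a 0/1 mask that keeps the `fraction` highest-scoring parameters.

    `sensitivity` is a hypothetical per-parameter score; the paper derives it
    from mutual information objectives, which this sketch does not model.
    """
    k = max(1, int(len(sensitivity) * fraction))
    # Indices of the k largest sensitivity scores.
    top = sorted(range(len(sensitivity)),
                 key=lambda i: sensitivity[i], reverse=True)[:k]
    mask = [0] * len(sensitivity)
    for i in top:
        mask[i] = 1
    return mask


def sparse_sgd_step(params, grads, mask, lr=0.1, keep_prob=0.1, rng=random):
    """Apply one SGD step to masked parameters, randomly dropping gradients.

    Each selected parameter is updated only with probability `keep_prob`,
    so the expected share updated per step is fraction * keep_prob.
    """
    new_params = list(params)
    updated = 0
    for i, (p, g) in enumerate(zip(params, grads)):
        if mask[i] and rng.random() < keep_prob:
            new_params[i] = p - lr * g
            updated += 1
    return new_params, updated


# Toy data standing in for a flattened PTM (assumption: 10,000 parameters).
rng = random.Random(0)
n = 10_000
params = [rng.gauss(0, 1) for _ in range(n)]
sensitivity = [abs(rng.gauss(0, 1)) for _ in range(n)]  # placeholder scores
grads = [rng.gauss(0, 1) for _ in range(n)]

mask = select_sparse_mask(sensitivity, fraction=0.04)
new_params, updated = sparse_sgd_step(params, grads, mask, rng=rng)
print(f"selected: {sum(mask) / n:.2%}, updated this step: {updated / n:.2%}")
```

With these assumed settings, about 4% of parameters are eligible for tuning and roughly a tenth of those receive an update in any given step, while every parameter outside the mask keeps its pre-trained value.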
