Sparse Tuning Enhances Plasticity in PTM-based Continual Learning

Authors

  • Huan Zhang Wuhan University
  • Shenghua Fan Wuhan University
  • Shuyu Dong Central China Normal University
  • Yujin Zheng Wuhan University
  • Dingwen Wang Wuhan University
  • Fan Lyu Institute of Automation, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v40i33.40050

Abstract

Continual Learning with Pre-trained Models (PTMs) holds great promise for efficient adaptation across sequential tasks. However, most existing approaches freeze PTMs and rely on auxiliary modules like prompts or adapters, limiting model plasticity and leading to suboptimal generalization when facing significant distribution shifts. While full fine-tuning can improve adaptability, it risks disrupting crucial pre-trained knowledge. In this paper, we propose Mutual Information-guided Sparse Tuning (MIST), a plug-and-play method that selectively updates a small subset of PTM parameters (less than 5%) based on their sensitivity to mutual information objectives. MIST enables effective task-specific adaptation while preserving generalization. To further reduce interference, we introduce strong sparsity regularization by randomly dropping gradients during tuning, resulting in fewer than 0.5% of parameters being updated per step. Applied before standard freeze-based methods, MIST consistently boosts performance across diverse continual learning benchmarks. Experiments show that integrating our method into multiple baselines yields significant performance gains.
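The two sparsity levels in the abstract (a fixed subset of under 5% of parameters, then random gradient dropping that leaves under 0.5% updated per step) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sensitivity scores are random placeholders standing in for the mutual-information-based scores, which the abstract does not specify, and the function names, the keep probability of 0.1, and the plain SGD update are all assumptions made for illustration.

```python
import random


def select_sparse_mask(sensitivity, fraction=0.05):
    """Return a 0/1 mask marking the top `fraction` of entries by score."""
    k = int(len(sensitivity) * fraction)
    order = sorted(range(len(sensitivity)),
                   key=lambda i: sensitivity[i], reverse=True)
    mask = [0] * len(sensitivity)
    for i in order[:k]:
        mask[i] = 1
    return mask


def drop_gradients(grads, mask, keep_prob, rng):
    """Zero gradients outside the mask, then randomly drop masked ones."""
    return [g if m and rng.random() < keep_prob else 0.0
            for g, m in zip(grads, mask)]


def sgd_step(params, grads, lr):
    """Plain SGD update; entries with zero gradient stay unchanged."""
    return [p - lr * g for p, g in zip(params, grads)]


rng = random.Random(0)
n = 10_000
params = [rng.gauss(0, 1) for _ in range(n)]

# Placeholder scores (assumption): a real run would derive these from
# the mutual information objectives mentioned in the abstract.
sensitivity = [rng.random() for _ in range(n)]
mask = select_sparse_mask(sensitivity, fraction=0.05)

# Placeholder gradients (assumption) for a single tuning step.
grads = [rng.gauss(0, 1) for _ in range(n)]
sparse_grads = drop_gradients(grads, mask, keep_prob=0.1, rng=rng)
new_params = sgd_step(params, sparse_grads, lr=0.01)

selected_fraction = sum(mask) / n
updated_fraction = sum(1 for g in sparse_grads if g != 0.0) / n
print(f"selected: {selected_fraction:.2%}, updated this step: {updated_fraction:.2%}")
```

With a keep probability of 0.1 applied to a 5% mask, about 0.5% of parameters are expected to change in any one step, and every parameter outside the mask stays at its pre-trained value.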

Published

2026-03-14

How to Cite

Zhang, H., Fan, S., Dong, S., Zheng, Y., Wang, D., & Lyu, F. (2026). Sparse Tuning Enhances Plasticity in PTM-based Continual Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(33), 28230–28238. https://doi.org/10.1609/aaai.v40i33.40050

Section

AAAI Technical Track on Machine Learning X