Grow-on-Demand: Sparse and Adaptive Expert Expansion for Continual Instruction Tuning

Ying Zhang; Xingyue Guo; Yu Zhao; Xuhui Sui; Baohang Zhou; Xinying Qian; Xiaojie Yuan

doi:10.1609/aaai.v40i34.40077

Authors

Ying Zhang Nankai University
Xingyue Guo Nankai University
Yu Zhao Nankai University
Xuhui Sui Nankai University
Baohang Zhou Tiangong University
Xinying Qian Nankai University
Xiaojie Yuan Nankai University

Abstract

Continual instruction tuning aims to incrementally adapt large language models to new tasks without forgetting previously acquired knowledge. Existing approaches often struggle to balance plasticity and stability. Replay-based methods retrain on historical data, which raises privacy concerns. Architecture-based methods allocate task-specific components, resulting in significant parameter growth. To address these limitations, we consider a structure-sharing strategy that enables parameter reuse across similar tasks and expands only when necessary, avoiding any data replay. Specifically, we introduce Grow-on-Demand (GoD-MoE), a parameter-efficient framework based on sparse and adaptive expert module expansion for continual instruction tuning. GoD-MoE inserts multiple LoRA-based experts into attention layers and dynamically activates a small subset of experts for each task. To avoid redundant parameter growth, we develop an Expert Demand Detector that decides whether new experts need to be added, facilitating adaptive structural sharing and minimizing parameter overhead. We conduct comprehensive experiments on the TRACE benchmark, demonstrating that GoD-MoE achieves state-of-the-art performance. Furthermore, it effectively mitigates catastrophic forgetting and even outperforms several advanced replay-based baselines.
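
The abstract describes LoRA-based experts with sparse activation but gives no implementation details, so the following is only a minimal NumPy sketch of the general idea, not the authors' method. The class names (LoRAExpert, SparseLoRALayer), the linear router, the top-k softmax weighting, and routing per input row rather than per task are all assumptions made for illustration. The criterion used by the Expert Demand Detector is not stated in the abstract; the sketch therefore exposes only a hypothetical add_expert() hook and leaves the decision of when to call it open.

```python
import numpy as np


class LoRAExpert:
    """One low-rank adapter (hypothetical name): delta(x) = x @ A^T @ B^T."""

    def __init__(self, d_in, d_out, rank, rng):
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in))
        # Zero-initialised B (the usual LoRA init) makes a new expert a no-op.
        self.B = np.zeros((d_out, rank))

    def delta(self, x):
        return x @ self.A.T @ self.B.T


class SparseLoRALayer:
    """Frozen linear map plus a sparsely activated pool of LoRA experts.

    Assumption: a linear router scores the experts and only the top_k
    highest-scoring ones contribute to each input row.
    """

    def __init__(self, d_in, d_out, rank=4, num_experts=4, top_k=2, seed=0):
        self.rng = np.random.default_rng(seed)
        self.d_in, self.d_out, self.rank, self.top_k = d_in, d_out, rank, top_k
        self.W = self.rng.normal(size=(d_out, d_in))  # stands in for a frozen weight
        self.experts = [LoRAExpert(d_in, d_out, rank, self.rng)
                        for _ in range(num_experts)]
        self.gate = self.rng.normal(size=(num_experts, d_in))  # router weights

    def add_expert(self):
        """Grow the pool by one expert and one router row.

        When to call this is the detector's job, which is not modelled here.
        """
        self.experts.append(LoRAExpert(self.d_in, self.d_out, self.rank, self.rng))
        self.gate = np.vstack([self.gate, self.rng.normal(size=(1, self.d_in))])

    def forward(self, x):
        logits = x @ self.gate.T                           # (batch, num_experts)
        k = min(self.top_k, len(self.experts))
        top_idx = np.argsort(logits, axis=-1)[:, -k:]      # indices of the k best experts
        top_logits = np.take_along_axis(logits, top_idx, axis=-1)
        top_logits = top_logits - top_logits.max(axis=-1, keepdims=True)
        weights = np.exp(top_logits)
        weights /= weights.sum(axis=-1, keepdims=True)     # softmax over selected experts only
        out = x @ self.W.T
        for e, expert in enumerate(self.experts):
            # Per-row weight of expert e; zero for rows where it was not selected.
            w = (weights * (top_idx == e)).sum(axis=-1, keepdims=True)
            if np.any(w > 0):
                out = out + w * expert.delta(x)
        return out, top_idx
```

In this sketch, calling forward on a batch returns the adapted output together with the indices of the experts that were activated for each row, and add_expert() enlarges the pool without touching the existing experts or the frozen weight, which is the property that parameter reuse across similar tasks relies on.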
