Grow-on-Demand: Sparse and Adaptive Expert Expansion for Continual Instruction Tuning

Authors

  • Ying Zhang, Nankai University
  • Xingyue Guo, Nankai University
  • Yu Zhao, Nankai University
  • Xuhui Sui, Nankai University
  • Baohang Zhou, Tiangong University
  • Xinying Qian, Nankai University
  • Xiaojie Yuan, Nankai University

DOI

https://doi.org/10.1609/aaai.v40i34.40077

Abstract

Continual instruction tuning aims to incrementally adapt large language models to new tasks without forgetting previously acquired knowledge. Existing approaches often struggle to balance plasticity and stability. Replay-based methods retrain on historical data, which raises privacy concerns. Architecture-based methods allocate task-specific components, which leads to significant parameter growth. To address these limitations, we propose a structure-sharing strategy that reuses parameters across similar tasks and expands only when necessary, without any data replay. Specifically, we introduce Grow-on-Demand (GoD-MoE), a parameter-efficient framework built on sparse and adaptive expert module expansion for continual instruction tuning. GoD-MoE inserts multiple LoRA-based experts into attention layers and dynamically activates a small subset of experts for each task. To avoid redundant parameter growth, we develop an Expert Demand Detector that decides whether new experts need to be added, which facilitates adaptive structural sharing and minimizes parameter overhead. Comprehensive experiments on the TRACE benchmark demonstrate that GoD-MoE achieves state-of-the-art performance. Furthermore, it effectively mitigates catastrophic forgetting and even outperforms several advanced replay-based baselines.
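
To make the mechanism in the abstract more concrete, the following is a minimal, illustrative sketch of a pool of LoRA-style experts that activates only a few experts per task and grows only when needed. It is not the authors' implementation. The abstract does not specify how the Expert Demand Detector or the routing works, so everything here is an assumption made for illustration: the cosine-similarity demand check, the `threshold` value, the per-expert routing `key`, the softmax weighting over the top-k experts, and all class and function names (`LoRAExpert`, `GrowOnDemandPool`, `needs_new_expert`, `prepare_task`).

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


class LoRAExpert:
    """Low-rank update delta(x) = B (A x), with B zero-initialised as in LoRA."""

    def __init__(self, dim, rank, key, rng):
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(dim)]
        self.key = list(key)  # hypothetical routing key, not taken from the paper

    def delta(self, x):
        return matvec(self.B, matvec(self.A, x))


class GrowOnDemandPool:
    """Expert pool with sparse activation and a stand-in demand check."""

    def __init__(self, dim, rank=2, top_k=2, threshold=0.8, seed=0):
        self.dim, self.rank, self.top_k = dim, rank, top_k
        self.threshold = threshold  # assumed similarity cutoff
        self.rng = random.Random(seed)
        self.experts = []

    def needs_new_expert(self, task_vec):
        """Stand-in detector: grow only if no expert key is similar enough."""
        return all(cosine(task_vec, e.key) < self.threshold for e in self.experts)

    def prepare_task(self, task_vec):
        """Add one expert if the stand-in detector asks for it; report whether it grew."""
        grew = self.needs_new_expert(task_vec)
        if grew:
            self.experts.append(LoRAExpert(self.dim, self.rank, task_vec, self.rng))
        return grew

    def forward(self, x, task_vec):
        """Add softmax-weighted deltas of the top-k most similar experts to x."""
        scored = sorted(
            ((cosine(task_vec, e.key), e) for e in self.experts),
            key=lambda pair: pair[0],
            reverse=True,
        )[: self.top_k]
        if not scored:
            return list(x)
        exps = [math.exp(score) for score, _ in scored]
        total = sum(exps)
        out = list(x)
        for weight, (_, expert) in zip(exps, scored):
            for i, d in enumerate(expert.delta(x)):
                out[i] += (weight / total) * d
        return out
```

In this sketch, calling `prepare_task` with a task vector close to an existing expert's key reuses the current pool, while a dissimilar task vector adds one new expert. Because each `B` matrix starts at zero, `forward` returns its input unchanged until the experts are trained, which mirrors the usual LoRA initialisation.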

Published

2026-03-14

How to Cite

Zhang, Y., Guo, X., Zhao, Y., Sui, X., Zhou, B., Qian, X., & Yuan, X. (2026). Grow-on-Demand: Sparse and Adaptive Expert Expansion for Continual Instruction Tuning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(34), 28474–28482. https://doi.org/10.1609/aaai.v40i34.40077

Issue

Vol. 40 No. 34

Section

AAAI Technical Track on Machine Learning XI