Distillation-Guided Structural Transfer for Continual Learning Beyond Sparse Distributed Memory
DOI:
https://doi.org/10.1609/aaai.v40i32.39960
Abstract
Sparse neural systems are gaining traction for efficient continual learning due to their modularity and low interference. Architectures like Sparse Distributed Memory Multi-Layer Perceptrons (SDMLP) construct task-specific subnetworks via Top-K activation and have shown resilience against catastrophic forgetting. However, their rigid modularity poses two fundamental challenges: (1) the isolation of sparse subnetworks severely limits cross-task knowledge reuse; and (2) increased sparsity reduces interference but often degrades performance due to constrained feature sharing. We propose Selective Subnetwork Distillation (SSD), a structurally guided continual learning framework that treats distillation not as a regularizer, but as a topology-aligned information conduit. By identifying neurons with high activation frequency, SSD selectively distills knowledge within previous Top-K subnetworks and output logits, without requiring replay or task labels, preserving both sparsity and functional specialization. Unlike conventional distillation, SSD operates under hard modular constraints and enables structural realignment without altering the sparse architecture. While our method is validated on SDMLP, its structure-aligned mechanism has the potential to generalize to other sparse networks as a plug-in module for promoting representation sharing. Comprehensive experiments on Split CIFAR-10, CIFAR-100, and MNIST demonstrate that SSD improves accuracy, retention, and manifold coverage, offering a structurally grounded solution to sparse continual learning.
Published
2026-03-14
How to Cite
Xue, H., Ran, X., Li, Y., Xu, Q., Li, E., Xu, Y., & Zhang, Q. (2026). Distillation-Guided Structural Transfer for Continual Learning Beyond Sparse Distributed Memory. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 27423–27431. https://doi.org/10.1609/aaai.v40i32.39960
Section
AAAI Technical Track on Machine Learning IX
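Illustrative sketch
The abstract names three ingredients: Top-K activation, selection of neurons with high activation frequency, and distillation on those neurons plus the output logits. The abstract gives no formulas, so the following Python sketch is only one plausible reading, not the authors' implementation. All function names, the frequency threshold of 0.5, the mean-squared-error feature term, the temperature, and the weighting factor alpha are assumptions introduced here for illustration.

```python
import math
from typing import List, Sequence


def top_k_indices(activations: Sequence[float], k: int) -> List[int]:
    """Indices of the k largest activations (ties go to the lower index)."""
    order = sorted(range(len(activations)), key=lambda i: (-activations[i], i))
    return sorted(order[:k])


def activation_frequency(batch: Sequence[Sequence[float]], k: int) -> List[float]:
    """Fraction of samples in which each neuron lands inside the Top-K set."""
    counts = [0] * len(batch[0])
    for activations in batch:
        for i in top_k_indices(activations, k):
            counts[i] += 1
    return [c / len(batch) for c in counts]


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def selective_distillation_loss(
    teacher_hidden: Sequence[float],
    student_hidden: Sequence[float],
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    selected: Sequence[int],
    temperature: float = 2.0,  # assumed value
    alpha: float = 0.5,        # assumed weighting between the two terms
) -> float:
    """Hypothetical loss: MSE on the selected neurons plus KL on softened logits."""
    if selected:
        feature_term = sum(
            (teacher_hidden[i] - student_hidden[i]) ** 2 for i in selected
        ) / len(selected)
    else:
        feature_term = 0.0
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # KL(p || q); the floor on q guards against log(0) after underflow.
    logit_term = sum(
        pi * (math.log(pi) - math.log(max(qi, 1e-12)))
        for pi, qi in zip(p, q)
        if pi > 0
    )
    return alpha * feature_term + (1 - alpha) * logit_term


# Toy hidden activations for three samples and four neurons, with k = 2.
batch = [
    [0.9, 0.1, 0.8, 0.0],
    [0.7, 0.2, 0.1, 0.6],
    [0.5, 0.4, 0.9, 0.3],
]
freq = activation_frequency(batch, k=2)
# Assumed rule: keep neurons that are in the Top-K set for at least half the samples.
selected = [i for i, f in enumerate(freq) if f >= 0.5]

loss = selective_distillation_loss(
    teacher_hidden=[0.9, 0.1, 0.8, 0.0],
    student_hidden=[0.6, 0.3, 0.7, 0.2],
    teacher_logits=[2.0, 0.5, -1.0],
    student_logits=[1.0, 1.0, -0.5],
    selected=selected,
)
```

In this toy run only neurons 0 and 2 are selected, so the feature term ignores the rarely active neurons 1 and 3. That is the sense in which distillation here is restricted to a sparse subnetwork rather than applied to the whole hidden layer.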