MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network

Yuming Zhang; Shouxin Zhang; Peizhe Wang; Feiyu Zhu; Dongzhi Guan; Junhao Su; Jiabin Liu; Changpeng Cai

doi:10.1609/aaai.v39i21.34428

Authors

Yuming Zhang, Southeast University
Shouxin Zhang, Southeast University
Peizhe Wang, Southeast University
Feiyu Zhu, University of Shanghai for Science and Technology
Dongzhi Guan, Southeast University
Junhao Su, Southeast University
Jiabin Liu, Southeast University
Changpeng Cai, Southeast University

Abstract

Deep neural networks (DNNs) are typically trained end-to-end (E2E), a paradigm that presents several challenges: high GPU memory consumption, inefficiency, and difficulty parallelizing the model during training. Recent research has sought to address these issues, and one promising approach is local learning, which partitions the backbone network into gradient-isolated modules and trains each local module with a manually designed auxiliary network. Existing methods often neglect the exchange of information between local modules, which leads to short-sighted (myopic) optimization and a performance gap relative to E2E training. To address these limitations, we propose the Multilaminar Leap Augmented Auxiliary Network (MLAAN). Specifically, MLAAN comprises Multilaminar Local Modules (MLM) and Leap Augmented Modules (LAM). MLM captures both local and global features through independent and cascaded auxiliary networks, alleviating the performance loss caused by insufficient global features. However, overly simplistic auxiliary networks can impede MLM's ability to capture global information. To address this, we further design LAM, an enhanced auxiliary network that uses the Exponential Moving Average (EMA) method to facilitate information exchange between local modules, thereby mitigating the shortsightedness that results from inadequate interaction. Together, MLM and LAM deliver strong performance. Our experiments on the CIFAR-10, STL-10, SVHN, and ImageNet datasets show that MLAAN can be seamlessly integrated into existing local learning frameworks, significantly enhancing their performance and even surpassing E2E training, while also reducing GPU memory consumption.
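
The abstract names the Exponential Moving Average (EMA) as the mechanism LAM uses to pass information between local modules, but it does not say which quantities are averaged or with what decay. The sketch below therefore shows only the generic EMA update rule, ema = decay * ema + (1 - decay) * current, in plain Python. The function name `ema_update`, the decay value, and the idea of one weight vector tracking another are illustrative assumptions, not the authors' implementation.

```python
from typing import List


def ema_update(ema: List[float], current: List[float], decay: float = 0.99) -> List[float]:
    """Return the element-wise exponential moving average of `ema` toward `current`.

    Generic EMA rule only; how MLAAN's LAM applies EMA is not specified
    in the abstract, so this is an assumption-laden illustration.
    """
    if len(ema) != len(current):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be in [0, 1]")
    return [decay * e + (1.0 - decay) * c for e, c in zip(ema, current)]


# Hypothetical usage: averaged weights slowly track another module's weights.
aux_weights = [0.0, 0.0]       # assumed starting values for the averaged copy
module_weights = [1.0, -2.0]   # assumed weights of a neighbouring local module
for _ in range(3):
    aux_weights = ema_update(aux_weights, module_weights, decay=0.5)
```

A decay close to 1 makes the averaged copy change slowly and smooths out step-to-step noise. A smaller decay, such as the 0.5 used here for readability, tracks the source more quickly.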
