GigaMoE: Sparsity-Guided Mixture of Experts for Efficient Gigapixel Object Detection
DOI:
https://doi.org/10.1609/aaai.v40i21.38810
Abstract
Object detection in High-Resolution Wide (HRW) shots, or gigapixel images, presents unique challenges due to extreme object sparsity and vast scale variations. State-of-the-art methods like SparseFormer have pioneered sparse processing by selectively focusing on important regions, yet they apply a uniform computational model to all selected regions, overlooking their intrinsic complexity differences. This leads to a suboptimal trade-off between performance and efficiency. In this paper, we introduce GigaMoE, a novel backbone architecture that pioneers adaptive computation for this domain by replacing the standard Feed-Forward Networks (FFNs) with a Mixture-of-Experts (MoE) module. Our architecture first employs a shared expert to provide a robust feature baseline for all selected regions. Upon this foundation, our core innovation, a novel Sparsity-Guided Routing mechanism, insightfully repurposes importance scores from the sparse backbone to provide a "computational bonus," dynamically engaging a variable number of specialized experts based on content complexity. The entire system is trained efficiently via a loss-free load-balancing technique, eliminating the need for cumbersome auxiliary losses. Extensive experiments show that GigaMoE sets a new state-of-the-art on the PANDA benchmark, improving detection accuracy by 1.1% over SparseFormer while simultaneously reducing the computational cost (FLOPs) by a remarkable 32.3%.
Published
2026-03-14
How to Cite
Li, X., Li, W., Wang, Y., Lyu, C., Lin, H., Ding, G., & Guo, Y. (2026). GigaMoE: Sparsity-Guided Mixture of Experts for Efficient Gigapixel Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(21), 17553–17561. https://doi.org/10.1609/aaai.v40i21.38810
Issue
Section
AAAI Technical Track on Humans and AI