DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
DOI:
https://doi.org/10.1609/aaai.v39i8.32878
Abstract
Multi-modal object Re-IDentification (ReID) aims to retrieve specific objects by combining complementary information from multiple modalities. Existing multi-modal object ReID methods primarily focus on the fusion of heterogeneous features. However, they often overlook the dynamic quality changes in multi-modal imaging. In addition, the shared information between different modalities can weaken modality-specific information. To address these issues, we propose a novel feature learning framework called DeMo for multi-modal object ReID, which adaptively balances decoupled features using a mixture of experts. To be specific, we first deploy a Patch-Integrated Feature Extractor (PIFE) to extract multi-granularity and multi-modal features. Then, we introduce a Hierarchical Decoupling Module (HDM) to decouple multi-modal features into non-overlapping forms, preserving the modality uniqueness and increasing the feature diversity. Finally, we propose an Attention-Triggered Mixture of Experts (ATMoE), which replaces traditional gating with dynamic attention weights derived from decoupled features. With these modules, our DeMo can generate more robust multi-modal features. Extensive experiments on three object ReID benchmarks verify the effectiveness of our methods.
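The idea of replacing a learned gate with attention weights derived from the features themselves can be pictured with a small sketch. This is not the authors' ATMoE implementation: the mean-pooled query, the scaled dot-product scoring, the one-expert-per-feature pairing, and the names `attention_weights` and `attention_moe` are all assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_weights(features):
    """Score each decoupled feature against a shared query.

    Assumption: the query is the mean of all features. The abstract only
    says the weights are derived from the decoupled features.
    """
    dim = len(features[0])
    query = [sum(f[i] for f in features) / len(features) for i in range(dim)]
    scores = [dot(query, f) / math.sqrt(dim) for f in features]
    return softmax(scores)


def attention_moe(features, experts):
    """Weighted sum of expert outputs, one (hypothetical) expert per feature."""
    weights = attention_weights(features)
    outputs = [expert(f) for expert, f in zip(experts, features)]
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]


# Toy usage: three 2-D "decoupled features" and identity experts.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
experts = [lambda f: list(f)] * 3
fused = attention_moe(features, experts)
```

In this toy setting the weights change with the input features rather than coming from a separate gating network, which is the contrast the abstract draws with traditional gating.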
Published
2025-04-11
How to Cite
Wang, Y., Liu, Y., Zheng, A., & Zhang, P. (2025). DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8141–8149. https://doi.org/10.1609/aaai.v39i8.32878
Section
AAAI Technical Track on Computer Vision VII