DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification

Authors

  • Yuhao Wang (School of Future Technology, School of Artificial Intelligence, Dalian University of Technology)
  • Yang Liu (School of Future Technology, School of Artificial Intelligence, Dalian University of Technology; Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University)
  • Aihua Zheng (School of Artificial Intelligence, Anhui University; Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University)
  • Pingping Zhang (School of Future Technology, School of Artificial Intelligence, Dalian University of Technology; Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Anhui University)

DOI:

https://doi.org/10.1609/aaai.v39i8.32878

Abstract

Multi-modal object Re-IDentification (ReID) aims to retrieve specific objects by combining complementary information from multiple modalities. Existing multi-modal object ReID methods primarily focus on the fusion of heterogeneous features. However, they often overlook the dynamic quality changes in multi-modal imaging. In addition, the shared information between different modalities can weaken modality-specific information. To address these issues, we propose a novel feature learning framework called DeMo for multi-modal object ReID, which adaptively balances decoupled features using a mixture of experts. To be specific, we first deploy a Patch-Integrated Feature Extractor (PIFE) to extract multi-granularity and multi-modal features. Then, we introduce a Hierarchical Decoupling Module (HDM) to decouple multi-modal features into non-overlapping forms, preserving the modality uniqueness and increasing the feature diversity. Finally, we propose an Attention-Triggered Mixture of Experts (ATMoE), which replaces traditional gating with dynamic attention weights derived from decoupled features. With these modules, our DeMo can generate more robust multi-modal features. Extensive experiments on three object ReID benchmarks verify the effectiveness of our methods.
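
As a rough illustration of the idea behind ATMoE, the sketch below weights a set of experts with attention scores computed from the features themselves rather than with a separately learned gating network. It is a minimal toy under stated assumptions, not the authors' implementation: the abstract does not specify how the attention weights are computed, so the scaled dot-product scoring, the function name attention_weighted_experts, the query vector, and the toy experts are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_weighted_experts(query, features, experts):
    """Fuse expert outputs using attention weights instead of a learned gate.

    query    -- hypothetical summary vector that attends over the features
    features -- one (assumed already decoupled) feature vector per expert
    experts  -- one callable per feature, mapping a vector to a vector
    """
    scale = math.sqrt(len(query))
    # Assumption: scaled dot-product attention between query and each feature.
    scores = [dot(query, f) / scale for f in features]
    weights = softmax(scores)
    # Each expert processes its own feature.
    outputs = [expert(f) for expert, f in zip(experts, features)]
    dim = len(outputs[0])
    # Weighted sum of expert outputs, one output dimension at a time.
    fused = [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]
    return fused, weights


# Toy data: three 2-D "decoupled" features and three simple experts.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
experts = [
    lambda f: [2.0 * x for x in f],
    lambda f: [x + 1.0 for x in f],
    lambda f: list(f),
]
query = [1.0, 1.0]

fused, weights = attention_weighted_experts(query, features, experts)
print("attention weights:", weights)
print("fused feature:", fused)
```

In this toy, the feature most aligned with the query receives the largest weight, so its expert contributes most to the fused output. That is the sense in which the weighting is dynamic: it changes with the input features instead of being fixed per expert.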

Published

2025-04-11

How to Cite

Wang, Y., Liu, Y., Zheng, A., & Zhang, P. (2025). DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8141–8149. https://doi.org/10.1609/aaai.v39i8.32878

Section

AAAI Technical Track on Computer Vision VII