Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding
DOI:
https://doi.org/10.1609/aaai.v37i1.25116
Keywords:
CV: Representation Learning for Vision, CV: Biometrics, Face, Gesture & Pose, CV: Multi-modal Vision, ML: Multimodal Learning
Abstract
Visible-infrared person re-identification (VI-ReID) aims to retrieve person images of the same identity across the RGB and infrared image spaces, which is important for real-world surveillance systems. In practice, VI-ReID is more challenging due to the heterogeneous modality discrepancy, which further aggravates the challenges of the traditional single-modality person ReID problem, i.e., inter-class confusion and intra-class variation. In this paper, we propose an aggregated memory-based cross-modality deep metric learning framework, which benefits from an increasing number of learned modality-aware and modality-agnostic centroid proxies for cluster contrast and mutual information learning. Furthermore, to suppress the modality discrepancy, the proposed cross-modality alignment objective simultaneously utilizes both historical and up-to-date cluster proxies for enhanced cross-modality association. This training mechanism yields hard positive references through the increased diversity of learned cluster proxies, and ultimately achieves a stronger "pulling close" effect between cross-modality image features. Extensive experimental results demonstrate the effectiveness of the proposed method, which surpasses state-of-the-art works by a large margin on the commonly used VI-ReID datasets.
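The memory-based cluster contrast described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's exact formulation: the function names, the momentum update rule, and the temperature value are assumptions, and the memory bank here holds one centroid proxy per cluster that is contrasted against batch features and updated with a momentum rule.

```python
import torch
import torch.nn.functional as F

def cluster_contrast_loss(features, labels, memory, temperature=0.05):
    """Cluster-contrast loss against a memory of centroid proxies (sketch).

    features: (B, D) L2-normalized image embeddings.
    labels:   (B,) cluster labels indexing rows of `memory`.
    memory:   (C, D) L2-normalized centroid proxies, one per cluster.
    """
    # Similarity of each feature to every stored proxy, sharpened by temperature.
    logits = features @ memory.t() / temperature
    # Pull each feature toward its own cluster proxy, push from the others.
    return F.cross_entropy(logits, labels)

@torch.no_grad()
def update_memory(memory, features, labels, momentum=0.2):
    """Momentum update of the centroid proxies with the current batch (sketch)."""
    for f, y in zip(features, labels):
        memory[y] = momentum * memory[y] + (1.0 - momentum) * f
        memory[y] = F.normalize(memory[y], dim=0)  # keep proxies unit-norm
    return memory
```

In a VI-ReID setting one would typically keep such a memory per modality (modality-aware proxies) plus a shared one (modality-agnostic proxies), and compute the loss for RGB and infrared features against both.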
Published
2023-06-26
How to Cite
Cheng, D., Wang, X., Wang, N., Wang, Z., Wang, X., & Gao, X. (2023). Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 425-432. https://doi.org/10.1609/aaai.v37i1.25116
Issue
Section
AAAI Technical Track on Computer Vision I