Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding

Authors

  • De Cheng Xidian University
  • Xiaolong Wang Xidian University
  • Nannan Wang Xidian University
  • Zhen Wang Zhejiang Lab
  • Xiaoyu Wang University of Science and Technology of China
  • Xinbo Gao Chongqing University of Posts and Telecommunications

DOI:

https://doi.org/10.1609/aaai.v37i1.25116

Keywords:

CV: Representation Learning for Vision, CV: Biometrics, Face, Gesture & Pose, CV: Multi-modal Vision, ML: Multimodal Learning

Abstract

Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same person across the RGB and infrared image spaces, which is very important for real-world surveillance systems. In practice, VI-ReID is more challenging because of the heterogeneous modality discrepancy, which further aggravates the challenges of the traditional single-modality person ReID problem, i.e., inter-class confusion and intra-class variations. In this paper, we propose an aggregated memory-based cross-modality deep metric learning framework, which benefits from the increasing number of learned modality-aware and modality-agnostic centroid proxies for cluster contrast and mutual information learning. Furthermore, to suppress the modality discrepancy, the proposed cross-modality alignment objective simultaneously utilizes both historical and up-to-date learned cluster proxies for enhanced cross-modality association. This training mechanism helps to obtain hard positive references through the increased diversity of learned cluster proxies, and finally achieves a stronger "pulling close" effect between cross-modality image features. Extensive experimental results demonstrate the effectiveness of the proposed method, which surpasses state-of-the-art works by a large margin on the commonly used VI-ReID datasets.
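
As a rough illustration of the general idea of contrasting features against memory-stored centroid proxies, the following minimal sketch may help. It is not the authors' implementation: the names `ProxyMemory` and `cluster_contrast_loss`, the momentum of 0.9, the temperature of 0.05, and the use of a single shared (modality-agnostic) memory are all assumptions made for illustration. The modality-aware proxies, the mutual information term, and the joint use of historical and up-to-date proxies described in the abstract are not reproduced here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ProxyMemory:
    """Hypothetical memory bank holding one centroid proxy per identity."""

    def __init__(self, proxies, momentum=0.9):
        self.proxies = [l2_normalize(p) for p in proxies]
        self.momentum = momentum  # assumed value, not taken from the paper

    def update(self, feature, label):
        """Momentum update of the proxy for `label` with a new feature."""
        f = l2_normalize(feature)
        m = self.momentum
        old = self.proxies[label]
        mixed = [m * o + (1.0 - m) * x for o, x in zip(old, f)]
        self.proxies[label] = l2_normalize(mixed)


def cluster_contrast_loss(feature, label, memory, temperature=0.05):
    """InfoNCE-style loss: cross-entropy over similarities to all proxies."""
    f = l2_normalize(feature)
    logits = [dot(f, p) / temperature for p in memory.proxies]
    max_logit = max(logits)  # subtract the maximum for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum - logits[label]


# Toy usage: two identities, one RGB and one infrared feature of identity 0.
memory = ProxyMemory([[1.0, 0.0], [0.0, 1.0]])
rgb_feature = [0.9, 0.1]
ir_feature = [0.7, 0.3]
total = (cluster_contrast_loss(rgb_feature, 0, memory)
         + cluster_contrast_loss(ir_feature, 0, memory))
memory.update(rgb_feature, 0)
memory.update(ir_feature, 0)
```

In this sketch, features from both modalities are compared against the same proxy for their identity, so minimizing the loss pulls RGB and infrared features of one person toward a common centroid while pushing them away from the proxies of other identities.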

Published

2023-06-26

How to Cite

Cheng, D., Wang, X., Wang, N., Wang, Z., Wang, X., & Gao, X. (2023). Cross-Modality Person Re-identification with Memory-Based Contrastive Embedding. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 425-432. https://doi.org/10.1609/aaai.v37i1.25116

Section

AAAI Technical Track on Computer Vision I