Cross-Modality Earth Mover’s Distance for Visible Thermal Person Re-identification

Authors

  • Yongguo Ling, Xiamen University
  • Zhun Zhong, University of Trento
  • Zhiming Luo, Xiamen University
  • Fengxiang Yang, Xiamen University
  • Donglin Cao, Xiamen University
  • Yaojin Lin, Minnan Normal University
  • Shaozi Li, Xiamen University, China
  • Nicu Sebe, University of Trento

DOI:

https://doi.org/10.1609/aaai.v37i2.25250

Keywords:

CV: Image and Video Retrieval, CV: Multi-modal Vision, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Visible thermal person re-identification (VT-ReID) suffers from inter-modality discrepancy and intra-identity variations. Distribution alignment is a popular solution for VT-ReID; however, its effectiveness is usually limited by intra-identity variations. In this paper, we propose the Cross-Modality Earth Mover's Distance (CM-EMD), which alleviates the impact of intra-identity variations during modality alignment. CM-EMD selects an optimal transport strategy and assigns high weights to pairs that have a smaller intra-identity variation. In this manner, the model focuses on reducing the inter-modality discrepancy while paying less attention to intra-identity variations, leading to more effective modality alignment. Moreover, we introduce two techniques to strengthen CM-EMD. First, Cross-Modality Discrimination Learning (CM-DL) is designed to overcome the discrimination degradation problem caused by modality alignment. By reducing the ratio between intra-identity and inter-identity variances, CM-DL leads the model to learn more discriminative representations. Second, we construct the Multi-Granularity Structure (MGS), which enables us to align modalities at both coarse- and fine-grained levels with the proposed CM-EMD. Extensive experiments show the benefits of the proposed CM-EMD and its auxiliary techniques (CM-DL and MGS). Our method achieves state-of-the-art performance on two VT-ReID benchmarks.
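
To make the optimal-transport idea in the abstract concrete, the following is a minimal sketch of a plain Earth Mover's Distance between a set of visible features and a set of thermal features. It is not the authors' CM-EMD: the abstract does not specify the cost function, the marginal weights, or the solver, so the sketch assumes Euclidean cost, uniform mass 1/n on each sample, and equal-size sets. Under those assumptions an optimal plan is a permutation, so a brute-force search over permutations is exact for small n. The names `pairwise_cost` and `emd_uniform` and the toy 2-D features are hypothetical.

```python
from itertools import permutations
import math


def pairwise_cost(visible, thermal):
    """Euclidean distance between every visible/thermal feature pair (assumed cost)."""
    return [[math.dist(v, t) for t in thermal] for v in visible]


def emd_uniform(visible, thermal):
    """Exact EMD between two equal-size sets, each sample carrying mass 1/n.

    Assumption: uniform marginals and equal sizes, so some optimal transport
    plan is a permutation. Brute force is only viable for small n.
    """
    n = len(visible)
    if n != len(thermal):
        raise ValueError("this sketch assumes equal-size sets")
    cost = pairwise_cost(visible, thermal)
    # Pick the one-to-one matching with the lowest total cost.
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    # The flow matrix puts mass 1/n on each matched pair and 0 elsewhere.
    flow = [[1.0 / n if best[i] == j else 0.0 for j in range(n)] for i in range(n)]
    distance = sum(cost[i][best[i]] for i in range(n)) / n
    return distance, flow


# Toy 2-D "features" for one identity (hypothetical values).
visible = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
thermal = [(5.0, 6.0), (0.0, 1.0), (1.0, 1.0)]

distance, flow = emd_uniform(visible, thermal)

# Baseline for comparison: average over all cross-modality pairs,
# which also counts pairs that are far apart.
cost = pairwise_cost(visible, thermal)
mean_all_pairs = sum(sum(row) for row in cost) / (len(visible) * len(thermal))
```

In this toy case the transport plan moves mass only between nearby pairs, so `distance` is smaller than `mean_all_pairs`, which is dominated by distant pairs. That loosely mirrors the abstract's point that pairs with smaller intra-identity variation should receive higher weight during alignment.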

Published

2023-06-26

How to Cite

Ling, Y., Zhong, Z., Luo, Z., Yang, F., Cao, D., Lin, Y., Li, S., & Sebe, N. (2023). Cross-Modality Earth Mover’s Distance for Visible Thermal Person Re-identification. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 1631-1639. https://doi.org/10.1609/aaai.v37i2.25250

Section

AAAI Technical Track on Computer Vision II