TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation

Authors

  • Yuhao Wang, School of Future Technology, School of Artificial Intelligence, Dalian University of Technology
  • Xuehu Liu, School of Computer Science and Artificial Intelligence, Wuhan University of Technology
  • Pingping Zhang, School of Future Technology, School of Artificial Intelligence, Dalian University of Technology
  • Hu Lu, School of Computer Science and Communication Engineering, Jiangsu University
  • Zhengzheng Tu, School of Computer Science and Technology, Anhui University
  • Huchuan Lu, School of Future Technology, School of Artificial Intelligence, Dalian University of Technology

DOI:

https://doi.org/10.1609/aaai.v38i6.28388

Keywords:

CV: Scene Analysis & Understanding, CV: Multi-modal Vision

Abstract

Multi-spectral object Re-identification (ReID) aims to retrieve specific objects by leveraging complementary information from different image spectra. It delivers great advantages over traditional single-spectral ReID in complex visual environments. However, the significant distribution gap among different image spectra poses great challenges for effective multi-spectral feature representations. In addition, most current Transformer-based ReID methods only utilize the global feature of class tokens to achieve holistic retrieval, ignoring the local discriminative ones. To address the above issues, we step further to utilize all the tokens of Transformers and propose a cyclic token permutation framework for multi-spectral object ReID, dubbed TOP-ReID. More specifically, we first deploy a multi-stream deep network based on vision Transformers to preserve distinct information from different image spectra. Then, we propose a Token Permutation Module (TPM) for cyclic multi-spectral feature aggregation. It not only facilitates the spatial feature alignment across different image spectra, but also allows the class token of each spectrum to perceive the local details of other spectra. Meanwhile, we propose a Complementary Reconstruction Module (CRM), which introduces dense token-level reconstruction constraints to reduce the distribution gap across different image spectra. With the above modules, our proposed framework can generate more discriminative multi-spectral features for robust object ReID. Extensive experiments on three ReID benchmarks (i.e., RGBNT201, RGBNT100 and MSVR310) verify the effectiveness of our method. The code is available at https://github.com/924973292/TOP-ReID.
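
As a rough illustration of the cyclic pairing idea behind the TPM, the following minimal sketch lets the class token of each spectrum attend to the patch tokens of the next spectrum in a cycle. It is not the authors' implementation (see the linked repository for that): the function names `attend` and `cyclic_permute`, the single-head dot-product attention without learned projections, and the "next spectrum" ordering are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, tokens):
    """Hypothetical helper: single-head dot-product attention.

    One query vector attends over a list of token vectors; the tokens
    serve as both keys and values (no learned projections).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * t for q, t in zip(query, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    return [sum(w * tok[d] for w, tok in zip(weights, tokens))
            for d in range(len(query))]


def cyclic_permute(class_tokens, patch_tokens):
    """Hypothetical helper: pair spectrum i's class token with the
    patch tokens of spectrum (i + 1) % n, an assumed cyclic order."""
    n = len(class_tokens)
    return [attend(class_tokens[i], patch_tokens[(i + 1) % n])
            for i in range(n)]


# Toy input: three spectra, two patch tokens each, feature dimension 2.
class_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
patch_tokens = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 2.0], [0.0, 2.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
fused = cyclic_permute(class_tokens, patch_tokens)
```

In this toy input the patch tokens within each spectrum are identical, so each fused class token simply equals the patch token of the next spectrum in the cycle, which makes the pairing easy to check by hand.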

Published

2024-03-24

How to Cite

Wang, Y., Liu, X., Zhang, P., Lu, H., Tu, Z., & Lu, H. (2024). TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 5758-5766. https://doi.org/10.1609/aaai.v38i6.28388

Issue

Section

AAAI Technical Track on Computer Vision V