DC-Former: Diverse and Compact Transformer for Person Re-identification

Authors

  • Wen Li, Ant Group
  • Cheng Zou, Ant Group
  • Meng Wang, Ant Group
  • Furong Xu, Ant Group
  • Jianan Zhao, Ant Group
  • Ruobing Zheng, Ant Group
  • Yuan Cheng, Artificial Intelligence Innovation and Incubation Institute, Fudan University
  • Wei Chu, Ant Group

DOI:

https://doi.org/10.1609/aaai.v37i2.25226

Keywords:

CV: Image and Video Retrieval, CV: Representation Learning for Vision

Abstract

In the person re-identification (ReID) task, learning discriminative representations with deep learning remains challenging because training data are limited. In general, a model performs better as the amount of data increases. Adding similar classes strengthens the classifier's ability to distinguish similar identities, which in turn makes the learned representation more discriminative. In this paper, we propose a Diverse and Compact Transformer (DC-Former) that achieves a similar effect by splitting the embedding space into multiple diverse and compact subspaces. A compact embedding subspace helps the model learn more robust and discriminative embeddings for identifying similar classes, and fusing these diverse embeddings, which contain more fine-grained information, further improves ReID performance. Specifically, multiple class tokens are used in a vision transformer to represent multiple embedding spaces. A self-diverse constraint (SDC) is then applied to these spaces to push them away from each other, which makes each embedding space diverse and compact. In addition, a dynamic weight controller (DWC) is designed to balance their relative importance during training. Experimental results show that our method surpasses previous state-of-the-art methods on several commonly used person ReID benchmarks. Our code is available at https://github.com/ant-research/Diverse-and-Compact-Transformer.
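
To make the idea of pushing class-token embeddings apart concrete, the following is a minimal, framework-free sketch. It is not the authors' implementation (see the repository linked above for that). It assumes, purely for illustration, that diversity is penalized through the mean pairwise cosine similarity among the embeddings that the multiple class tokens produce for one image. The names `cosine_similarity` and `diversity_penalty` are hypothetical and do not come from the paper.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(token_embeddings):
    """Mean pairwise cosine similarity among one image's class-token embeddings.

    ASSUMPTION: this stands in for a diversity term of the kind the abstract
    describes. Minimizing it pushes the embeddings away from each other.
    The exact form of the SDC in the paper may differ.
    """
    if len(token_embeddings) < 2:
        raise ValueError("need at least two class-token embeddings")
    pairs = list(combinations(token_embeddings, 2))
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Two toy cases: identical embeddings are maximally redundant,
# orthogonal embeddings are fully diverse under this measure.
redundant = [[1.0, 0.0], [1.0, 0.0]]
diverse = [[1.0, 0.0], [0.0, 1.0]]
print(diversity_penalty(redundant), diversity_penalty(diverse))
```

In a training setup such a term would typically be added to the usual ReID losses with a weight. How that weight is chosen or adapted over training is the role the abstract assigns to the DWC, and it is not modeled here.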

Published

2023-06-26

How to Cite

Li, W., Zou, C., Wang, M., Xu, F., Zhao, J., Zheng, R., Cheng, Y., & Chu, W. (2023). DC-Former: Diverse and Compact Transformer for Person Re-identification. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 1415-1423. https://doi.org/10.1609/aaai.v37i2.25226

Section

AAAI Technical Track on Computer Vision II