Relation-Guided Spatial Attention and Temporal Refinement for Video-Based Person Re-Identification

Authors

  • Xingze Li University of Science and Technology of China
  • Wengang Zhou University of Science and Technology of China
  • Yun Zhou University of Science and Technology of China
  • Houqiang Li University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v34i07.6807

Abstract

Video-based person re-identification has received considerable attention in recent years because of its important applications in video surveillance. Compared with image-based person re-identification, video-based person re-identification is characterized by a much richer context, which makes it more important to identify informative regions and to fuse temporal information across frames. In this paper, we propose two relation-guided modules to learn reinforced feature representations for effective re-identification. First, a relation-guided spatial attention (RGSA) module is designed to explore the discriminative regions globally. The weight at each position is determined by its own feature as well as the relation features from other positions, revealing the dependence between local and global contents. Then, based on the adaptively weighted frame-level features, a relation-guided temporal refinement (RGTR) module is proposed to further refine the feature representations across frames. The relation information learned by the RGTR module enables individual frames to complement each other during aggregation, leading to robust video-level feature representations. Extensive experiments on four prevalent benchmarks verify the state-of-the-art performance of the proposed method.
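
The two-stage idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the network layers, so the scoring rules here (dot-product relations, squared-norm self term, softmax normalization, residual mixing, mean pooling) and the function names `spatial_attention` and `temporal_refinement` are assumptions chosen only to show how a position weight can depend on both its own feature and its relations to other positions, and how frames can then complement each other.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention(positions):
    """Pool one frame's position features into a frame-level feature.

    Assumed scoring rule (not from the paper): each position's score is
    its own squared norm plus its mean dot-product relation to all
    positions. A trained model would learn this mapping instead.
    """
    n = len(positions)
    scores = []
    for p in positions:
        own = dot(p, p)
        relation = sum(dot(p, q) for q in positions) / n
        scores.append(own + relation)
    weights = softmax(scores)
    dim = len(positions[0])
    frame = [sum(w * p[d] for w, p in zip(weights, positions)) for d in range(dim)]
    return frame, weights


def temporal_refinement(frames):
    """Refine each frame with a relation-weighted mix of all frames, then average.

    Assumed rule (not from the paper): relations are dot products between
    frame features, and the mix is added back to the frame as a residual.
    """
    dim = len(frames[0])
    refined = []
    for f in frames:
        weights = softmax([dot(f, g) for g in frames])
        mix = [sum(w * g[d] for w, g in zip(weights, frames)) for d in range(dim)]
        refined.append([a + b for a, b in zip(f, mix)])
    return [sum(r[d] for r in refined) / len(refined) for d in range(dim)]


# Hypothetical input: 2 frames, 3 spatial positions each, 2 channels.
video = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]],
]
frame_features = [spatial_attention(frame)[0] for frame in video]
video_feature = temporal_refinement(frame_features)
print(video_feature)
```

In this toy input, the third position of the first frame is related to both other positions and has the largest norm, so it receives the largest spatial weight. The video-level feature has the same dimensionality as a single position feature.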

Published

2020-04-03

How to Cite

Li, X., Zhou, W., Zhou, Y., & Li, H. (2020). Relation-Guided Spatial Attention and Temporal Refinement for Video-Based Person Re-Identification. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 11434-11441. https://doi.org/10.1609/aaai.v34i07.6807

Section

AAAI Technical Track: Vision