Towards Versatile Pedestrian Detector with Multisensory-Matching and Multispectral Recalling Memory


  • Jung Uk Kim, KAIST
  • Sungjune Park, KAIST
  • Yong Man Ro, KAIST



Computer Vision (CV)


Automated surveillance cameras can now switch between a visible sensor and a thermal sensor for all-day operation. However, existing single-modal pedestrian detectors focus on detecting pedestrians in only one specific modality (i.e., visible or thermal), so they cannot cope with inputs from the other modality. Recent multispectral pedestrian detectors have shown remarkable performance by using both modalities, but they have practical limitations of their own (e.g., different Field-of-View (FoV) and frame rate between the sensors). In this paper, we introduce a versatile pedestrian detector that shows robust detection performance in any single modality. We propose a multisensory-matching contrastive loss to reduce the difference between the visual representations of pedestrians in the visible and thermal modalities. Moreover, for robust detection on a single modality, we design a Multispectral Recalling (MSR) Memory. The MSR Memory enhances the visual representation of single-modal features by recalling that of the multispectral modalities. To guide the MSR Memory to store multispectral modal contexts, we introduce a multispectral recalling loss, which enables the pedestrian detector to encode more discriminative features from a single input modality. We believe our method is a step toward a detector that can be applied to a variety of real-world applications. Comprehensive experimental results verify the effectiveness of the proposed method.
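
The abstract does not give the form of the multisensory-matching contrastive loss, so the following is only an illustrative sketch of the general idea of pulling paired visible and thermal pedestrian features together. It assumes a symmetric InfoNCE-style objective over a batch in which `visible[i]` and `thermal[i]` describe the same pedestrian; the function names, the `temperature` parameter, and the use of cosine similarity are assumptions made for illustration and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cross_modal_contrastive_loss(visible, thermal, temperature=0.1):
    """Symmetric InfoNCE-style loss between paired visible and thermal features.

    Assumption (not from the paper): visible[i] and thermal[i] are features of
    the same pedestrian, and every other pairing in the batch is a negative.
    """
    if not visible or len(visible) != len(thermal):
        raise ValueError("expected two non-empty batches of equal size")

    vis = [l2_normalize(v) for v in visible]
    thr = [l2_normalize(t) for t in thermal]
    n = len(vis)

    # logits[i][j] = cosine similarity of visible i and thermal j, scaled.
    logits = [
        [sum(a * b for a, b in zip(vis[i], thr[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def mean_cross_entropy(rows):
        # Cross-entropy with the matching index i as the target of row i,
        # using the log-sum-exp trick for numerical stability.
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Average the visible-to-thermal and thermal-to-visible directions.
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (mean_cross_entropy(logits) + mean_cross_entropy(transposed))


if __name__ == "__main__":
    visible = [[1.0, 0.0], [0.0, 1.0]]
    thermal = [[0.9, 0.1], [0.1, 0.9]]
    print(cross_modal_contrastive_loss(visible, thermal))
```

Under these assumptions, the loss is small when each visible feature is closest to its own thermal counterpart and grows when the pairing is mismatched, which is the behaviour a modality-matching objective is meant to encourage.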




How to Cite

Kim, J. U., Park, S., & Ro, Y. M. (2022). Towards Versatile Pedestrian Detector with Multisensory-Matching and Multispectral Recalling Memory. Proceedings of the AAAI Conference on Artificial Intelligence, 36(1), 1157-1165.



AAAI Technical Track on Computer Vision I