Edge-Aware Guidance Fusion Network for RGB–Thermal Scene Parsing

Authors

  • Wujie Zhou Zhejiang University of Science and Technology
  • Shaohua Dong Zhejiang University of Science and Technology
  • Caie Xu Zhejiang University of Science and Technology
  • Yaguan Qian Zhejiang University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v36i3.20269

Keywords:

Computer Vision (CV)

Abstract

RGB–thermal scene parsing has recently attracted increasing research interest in the field of computer vision. However, most existing methods fail to perform good boundary extraction for prediction maps and cannot fully use high-level features. In addition, these methods simply fuse the features from RGB and thermal modalities but are unable to obtain comprehensive fused features. To address these problems, we propose an edge-aware guidance fusion network (EGFNet) for RGB–thermal scene parsing. First, we introduce a prior edge map generated using the RGB and thermal images to capture detailed information in the prediction map and then embed the prior edge information in the feature maps. To effectively fuse the RGB and thermal information, we propose a multimodal fusion module that guarantees adequate cross-modal fusion. Considering the importance of high-level semantic information, we propose a global information module and a semantic information module to extract rich semantic information from the high-level features. For decoding, we use simple elementwise addition for cascaded feature fusion. Finally, to improve the parsing accuracy, we apply multitask deep supervision to the semantic and boundary maps. Extensive experiments were performed on benchmark datasets to demonstrate the effectiveness of the proposed EGFNet and its superior performance compared with state-of-the-art methods. The code and results can be found at https://github.com/ShaohuaDong2021/EGFNet.
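
The decoding step described in the abstract, cascaded feature fusion by simple elementwise addition, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `upsample_nearest` and `cascaded_add_decoder`, the nearest-neighbour upsampling, the shared channel count, and the factor-of-two resolution steps are all assumptions made for illustration.

```python
import numpy as np


def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) feature map.

    Assumption: the abstract does not state which upsampling is used.
    """
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def cascaded_add_decoder(features):
    """Fuse feature maps ordered from highest to lowest resolution.

    Assumption: every map has the same channel count and each level
    halves the spatial size of the previous one.
    """
    fused = features[-1]  # start from the coarsest (high-level) features
    for skip in reversed(features[:-1]):
        # Upsample the running result, then add the finer-level features.
        fused = upsample_nearest(fused) + skip
    return fused


# Toy feature pyramid: 8x8, 4x4, and 2x2 maps with 4 channels each.
feats = [
    np.ones((4, 8, 8)),
    np.ones((4, 4, 4)) * 2,
    np.ones((4, 2, 2)) * 3,
]
out = cascaded_add_decoder(feats)
print(out.shape)
```

Addition keeps the decoder free of extra parameters, at the cost of requiring matching channel counts across levels. In a real network a convolution would typically align the channels first.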

Published

2022-06-28

How to Cite

Zhou, W., Dong, S., Xu, C., & Qian, Y. (2022). Edge-Aware Guidance Fusion Network for RGB–Thermal Scene Parsing. Proceedings of the AAAI Conference on Artificial Intelligence, 36(3), 3571-3579. https://doi.org/10.1609/aaai.v36i3.20269

Section

AAAI Technical Track on Computer Vision III