Pay Attention to Target: Relation-Aware Temporal Consistency for Domain Adaptive Video Semantic Segmentation

Huayu Mai; Rui Sun; Yuan Wang; Tianzhu Zhang; Feng Wu

doi:10.1609/aaai.v38i5.28211

Authors

Huayu Mai, Deep Space Exploration Laboratory/School of Information Science and Technology, University of Science and Technology of China
Rui Sun, Deep Space Exploration Laboratory/School of Information Science and Technology, University of Science and Technology of China
Yuan Wang, Deep Space Exploration Laboratory/School of Information Science and Technology, University of Science and Technology of China
Tianzhu Zhang, Deep Space Exploration Laboratory/School of Information Science and Technology, University of Science and Technology of China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Feng Wu, Deep Space Exploration Laboratory/School of Information Science and Technology, University of Science and Technology of China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

DOI:

https://doi.org/10.1609/aaai.v38i5.28211

Keywords:

CV: Segmentation

Abstract

Video semantic segmentation has made remarkable progress thanks to the development of deep learning, but it depends on annotated training data that is labor-intensive to collect. To alleviate this data hunger, domain adaptation approaches aim to adapt a model trained on labeled synthetic videos to real videos for which no annotations are available. By analyzing consistency regularization, the dominant paradigm in this domain adaptation task, we identify bottlenecks in previous methods from the perspective of pseudo-labels. To take full advantage of the information contained in the pseudo-labels and provide more effective supervision signals, we propose a coherent PAT network comprising a target domain focalizer and relation-aware temporal consistency. The proposed PAT network has several merits. First, the target domain focalizer directs attention to the target domain and increases the accessibility of pseudo-labels in consistency training. Second, the relation-aware temporal consistency models the inter-class consistent relationship across frames to equip the model with effective supervision signals. Extensive experimental results on two challenging benchmarks demonstrate that our method performs favorably against state-of-the-art domain adaptive video semantic segmentation methods.
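
For readers unfamiliar with the consistency regularization paradigm mentioned in the abstract, the following is a minimal, generic sketch of pseudo-label consistency training. It is not the PAT network: the function names, the weak/strong view setup, and the confidence threshold of 0.9 are illustrative assumptions that do not come from the paper. It shows only how thresholded pseudo-labels discard low-confidence pixels, which is one way pseudo-label information can go unused.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label_consistency_loss(weak_logits, strong_logits, threshold=0.9):
    """Generic pseudo-label consistency loss (illustrative, not the paper's loss).

    weak_logits, strong_logits: per-pixel lists of class scores predicted for
    two views of the same unlabeled target frame.
    threshold: assumed confidence cut-off for accepting a pseudo-label.
    Returns (mean cross-entropy over accepted pixels, number of accepted pixels).
    """
    total, used = 0.0, 0
    for weak, strong in zip(weak_logits, strong_logits):
        weak_probs = softmax(weak)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # low-confidence pixel: no pseudo-label, no supervision
        label = weak_probs.index(confidence)  # pseudo-label from the weak view
        total += -math.log(softmax(strong)[label])
        used += 1
    return (total / used if used else 0.0), used


# Two pixels, three classes: only the first pixel is confident enough.
weak = [[5.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
strong = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
loss, used = pseudo_label_consistency_loss(weak, strong)
```

In this toy input the second pixel falls below the threshold and contributes no supervision signal at all, so `used` counts a single pixel.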
