Self Correspondence Distillation for End-to-End Weakly-Supervised Semantic Segmentation

Authors

  • Rongtao Xu, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Changwei Wang, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Jiaxi Sun, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Shibiao Xu, School of Artificial Intelligence, Beijing University of Posts and Telecommunications
  • Weiliang Meng, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Xiaopeng Zhang, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v37i3.25408

Keywords:

CV: Representation Learning for Vision, CV: Scene Analysis & Understanding, ML: Deep Neural Network Algorithms, ML: Representation Learning, ML: Unsupervised & Self-Supervised Learning

Abstract

Efficiently training accurate deep models for weakly supervised semantic segmentation (WSSS) with only image-level labels is both important and challenging. End-to-end WSSS methods have recently become a focus of research because of their high training efficiency. However, current methods fail to extract sufficiently comprehensive semantic information, yielding low-quality pseudo-labels and sub-optimal end-to-end WSSS results. To this end, we propose a simple and novel Self Correspondence Distillation (SCD) method that refines pseudo-labels without introducing external supervision. SCD lets the network use feature correspondence derived from itself as a distillation target, which enhances the network's feature learning by complementing its semantic information. In addition, to further improve segmentation accuracy, we design a Variation-aware Refine Module that enhances the local consistency of pseudo-labels by computing pixel-level variation. Finally, combining SCD with the Variation-aware Refine Module, we present TSCD, an efficient end-to-end Transformer-based framework for accurate WSSS. Extensive experiments on the PASCAL VOC 2012 and MS COCO 2014 datasets demonstrate that our method significantly outperforms other state-of-the-art methods. Our code is available at https://github.com/Rongtao-Xu/RepresentationLearning/tree/main/SCD-AAAI2023.
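
To make the distillation idea concrete, below is a minimal PyTorch sketch of a self-correspondence loss as we read it from the abstract: pairwise feature affinities computed from the network itself serve as a detached distillation target, so no external supervision is needed. The function names, the choice of distilling between two feature maps of the same network (assumed to share spatial resolution), and the MSE objective are our assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def self_correspondence(feat):
    """Pairwise cosine-similarity ("self correspondence") of a feature map.

    feat: (B, C, H, W) -> (B, HW, HW) affinity matrix over all pixel pairs.
    """
    f = F.normalize(feat.flatten(2), dim=1)   # (B, C, HW), unit-norm channels
    return torch.bmm(f.transpose(1, 2), f)    # (B, HW, HW) cosine affinities

def scd_loss(feat_a, feat_b):
    """Distillation with a self-derived target: the affinity matrix from
    feat_b (another feature map of the same network, same resolution) is
    detached and used as the target for feat_a's affinities."""
    target = self_correspondence(feat_b).detach()
    return F.mse_loss(self_correspondence(feat_a), target)
```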
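
Likewise, here is one hedged reading of "enhancing local consistency by computing pixel-level variation": measure how much each pixel differs from its neighbourhood, then smooth the pseudo-label probabilities only where the image is locally flat. The 4-neighbourhood variation measure, the exponential flatness weight, and the 3x3 averaging (plus the assumption that probabilities and image share a resolution) are illustrative choices, not the module defined in the paper.

```python
import torch
import torch.nn.functional as F

def pixel_variation(img):
    """Per-pixel variation: mean absolute difference to the 4-neighbourhood.

    img: (B, C, H, W) -> (B, 1, H, W)."""
    pad = F.pad(img, (1, 1, 1, 1), mode="replicate")
    diffs = torch.stack([
        (img - pad[..., 1:-1, :-2]).abs(),   # left neighbour
        (img - pad[..., 1:-1, 2:]).abs(),    # right neighbour
        (img - pad[..., :-2, 1:-1]).abs(),   # top neighbour
        (img - pad[..., 2:, 1:-1]).abs(),    # bottom neighbour
    ])
    return diffs.mean(dim=(0, 2)).unsqueeze(1)

def refine_pseudo_labels(probs, img, alpha=10.0):
    """Where the image is locally flat (low variation), pull class
    probabilities toward their 3x3 neighbourhood mean; near edges
    (high variation), keep the original prediction."""
    flatness = torch.exp(-alpha * pixel_variation(img))   # in (0, 1]
    smoothed = F.avg_pool2d(probs, kernel_size=3, stride=1, padding=1)
    return flatness * smoothed + (1.0 - flatness) * probs
```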

Published

2023-06-26

How to Cite

Xu, R., Wang, C., Sun, J., Xu, S., Meng, W., & Zhang, X. (2023). Self Correspondence Distillation for End-to-End Weakly-Supervised Semantic Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 3045-3053. https://doi.org/10.1609/aaai.v37i3.25408

Section

AAAI Technical Track on Computer Vision III