CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen

Authors

  • Hao Zhang, University of Illinois at Urbana-Champaign
  • Fang Li, University of Illinois at Urbana-Champaign
  • Lu Qi, The University of California, Merced
  • Ming-Hsuan Yang, University of California at Merced
  • Narendra Ahuja, University of Illinois at Urbana-Champaign, USA

DOI:

https://doi.org/10.1609/aaai.v38i7.28535

Keywords:

CV: Segmentation, CV: Representation Learning for Vision

Abstract

Out-of-Distribution (OOD) segmentation and Zero-Shot Semantic Segmentation (ZS3) are challenging because both require segmenting unseen classes. Existing strategies adapt the class-agnostic Mask2Former (CA-M2F) to specific tasks. However, these methods each cater to a single task and demand training from scratch, and we demonstrate certain deficiencies in CA-M2F that affect performance. We propose Class-Agnostic Structure-Constrained Learning (CSL), a plug-in framework that can integrate with existing methods, thereby embedding structural constraints and achieving performance gains on tasks that include unseen classes, specifically OOD segmentation, ZS3, and domain adaptation (DA). CSL can integrate with existing methods in two ways: (1) by distilling knowledge from a base teacher network, enforcing constraints across the training and inference phases, or (2) by leveraging established models to obtain per-pixel distributions without retraining, appending constraints during the inference phase. Our soft assignment and mask split methodologies enhance OOD object segmentation. Empirical evaluations demonstrate that CSL boosts the performance of existing algorithms on OOD segmentation, ZS3, and DA segmentation, consistently surpassing the state of the art across all three tasks.

Published

2024-03-24

How to Cite

Zhang, H., Li, F., Qi, L., Yang, M.-H., & Ahuja, N. (2024). CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 7078-7086. https://doi.org/10.1609/aaai.v38i7.28535

Section

AAAI Technical Track on Computer Vision VI