How to Save your Annotation Cost for Panoptic Segmentation?

Authors

  • Xuefeng Du, Xi'an Jiaotong University
  • ChenHan Jiang, Huawei Noah's Ark Lab
  • Hang Xu, Huawei Noah's Ark Lab
  • Gengwei Zhang, Sun Yat-Sen University
  • Zhenguo Li, Huawei Noah's Ark Lab

DOI:

https://doi.org/10.1609/aaai.v35i2.16216

Keywords:

Segmentation

Abstract

How can we properly reduce the annotation cost of panoptic segmentation, and how can we leverage and optimize the cost-quality trade-off between training data and model? These questions are key challenges toward a label-efficient and scalable panoptic segmentation system, given its expensive pixel-level instance and semantic annotation requirements. By closely examining different kinds of cheaper labels, we introduce a novel multi-objective framework that automatically determines the allocation of different annotations, reaching better segmentation quality at lower annotation cost. Specifically, we design a Cost-Quality Balanced Network (CQB-Net) to generate the panoptic segmentation map, which distills the crucial relations among various forms of supervision, including panoptic labels, image-level classification labels, bounding boxes, and the semantic coherence between foreground and background. Instead of allocating annotations ad hoc during training, we formulate the optimization of the cost-quality trade-off as a Multi-Objective Optimization Problem (MOOP): we model the marginal quality improvement of each annotation type and approximate the Pareto front to obtain a label-efficient allocation ratio. Extensive experiments on the COCO benchmark show the superiority of our method, e.g., achieving a segmentation quality of 43.4% versus 43.0% for OCFusion while reducing the annotation cost by a factor of 2.4.
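The abstract's key mechanism, modeling the marginal quality improvement of each annotation type and approximating the Pareto front of the cost-quality trade-off, can be illustrated with a small sketch. The code below is a hypothetical toy, not the authors' CQB-Net or their MOOP solver: the cost table, the diminishing-returns gain model, and the brute-force enumeration of allocation ratios are all assumptions made purely for illustration.

```python
import itertools

# Per-unit annotation costs (relative effort) for each label type.
# These numbers are illustrative placeholders, not values from the paper.
COSTS = {"panoptic": 10.0, "bbox": 1.0, "image_label": 0.1}

# Maximum quality gain contributed by each label type, again hypothetical;
# the paper models marginal gains rather than fixing them by hand.
GAINS = {"panoptic": 30.0, "bbox": 9.0, "image_label": 3.0}

def quality(alloc):
    """Toy stand-in for the marginal-quality model: each extra unit of an
    annotation type yields diminishing returns."""
    return sum(GAINS[k] * (1.0 - 0.5 ** n) for k, n in alloc.items())

def pareto_front(points):
    """Keep the (cost, quality, alloc) points that no other point dominates,
    i.e. no point with lower-or-equal cost and strictly higher quality."""
    front = []
    for cost, q, alloc in sorted(points, key=lambda p: (p[0], -p[1])):
        if not front or q > front[-1][1]:
            front.append((cost, q, alloc))
    return front

# Enumerate candidate allocations (0-5 units of each label type).
candidates = []
for units in itertools.product(range(6), repeat=len(COSTS)):
    alloc = dict(zip(COSTS, units))
    cost = sum(COSTS[k] * n for k, n in alloc.items())
    candidates.append((cost, quality(alloc), alloc))

# The approximated Pareto front: cheapest allocation for each quality level.
for cost, q, alloc in pareto_front(candidates):
    print(f"cost={cost:5.1f}  quality={q:5.2f}  alloc={alloc}")
```

A real system would replace the hard-coded gain model with measured marginal quality improvements and would not rely on brute-force enumeration, but the dominance check and the resulting cost-quality staircase capture the shape of the allocation problem the abstract describes.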

Published

2021-05-18

How to Cite

Du, X., Jiang, C., Xu, H., Zhang, G., & Li, Z. (2021). How to Save your Annotation Cost for Panoptic Segmentation? Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), 1282-1290. https://doi.org/10.1609/aaai.v35i2.16216

Section

AAAI Technical Track on Computer Vision I