Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation

Authors

  • Gengwei Zhang, Sun Yat-sen University
  • Yiming Gao, Sun Yat-sen University
  • Hang Xu, Huawei Noah's Ark Lab
  • Hao Zhang, Shanghai Jiao Tong University
  • Zhenguo Li, Huawei Noah's Ark Lab
  • Xiaodan Liang, Sun Yat-sen University

Keywords

Segmentation, Hyperparameter Tuning / Algorithm Configuration

Abstract

Panoptic segmentation, which unifies instance segmentation and semantic segmentation, has recently attracted increasing attention. While most existing methods focus on designing novel architectures, we take a different perspective: performing automated multi-loss adaptation (named Ada-Segment) on the fly, flexibly adjusting multiple training losses over the course of training with a controller trained to capture the learning dynamics. This offers several advantages: it bypasses manual tuning of the sensitive loss combination, a decisive factor for panoptic segmentation; it explicitly models the learning dynamics and reconciles the learning of multiple objectives (up to ten in our experiments); and, with an end-to-end architecture, it generalizes to different datasets without the need to re-tune hyperparameters or laboriously re-adjust the training process. Ada-Segment improves panoptic quality (PQ) by 2.7% over the vanilla baseline on the COCO val split, achieving a state-of-the-art 48.5% PQ on the COCO test-dev split and 32.9% PQ on the ADE20K dataset. Extensive ablation studies reveal ever-changing dynamics throughout the training process, which call for an automated and adaptive learning strategy such as the one presented in this paper.
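
To make the idea of a "loss combination" concrete, the following is a minimal sketch of a weighted sum of several training losses whose weights change over training. It is not the Ada-Segment controller, which the abstract describes only as a trained component; the loss names, the function names, and the linear interpolation schedule are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict


def combine_losses(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of named loss terms.

    Both dictionaries are keyed by hypothetical loss names; every loss
    term must have a corresponding weight.
    """
    missing = set(losses) - set(weights)
    if missing:
        raise KeyError(f"no weight for loss terms: {sorted(missing)}")
    return sum(weights[name] * value for name, value in losses.items())


def interpolate_weights(
    start: Dict[str, float], end: Dict[str, float], progress: float
) -> Dict[str, float]:
    """Stand-in weight schedule (assumption, not the learned controller).

    Linearly interpolates each weight from `start` to `end`, where
    `progress` runs from 0.0 (start of training) to 1.0 (end of training).
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return {
        name: (1.0 - progress) * start[name] + progress * end[name]
        for name in start
    }


# Hypothetical loss values and weight settings, for illustration only.
losses = {"semantic": 1.0, "instance_mask": 2.0}
start_weights = {"semantic": 1.0, "instance_mask": 1.0}
end_weights = {"semantic": 3.0, "instance_mask": 0.0}

halfway = interpolate_weights(start_weights, end_weights, 0.5)
total = combine_losses(losses, halfway)
```

In this sketch the schedule is fixed in advance; the approach described in the abstract instead adjusts the weights adaptively during training, so `interpolate_weights` only marks the place where such a controller would plug in.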

Published

2021-05-18

How to Cite

Zhang, G., Gao, Y., Xu, H., Zhang, H., Li, Z., & Liang, X. (2021). Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 3333-3341. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16445

Section

AAAI Technical Track on Computer Vision III