SasWOT: Real-Time Semantic Segmentation Architecture Search WithOut Training

Authors

  • Chendi Zhu, State Key Laboratory for Novel Software Technology, Nanjing University
  • Lujun Li, The Hong Kong University of Science and Technology
  • Yuli Wu, Institute of Imaging and Computer Vision, RWTH Aachen University, Aachen, Germany
  • Zhengxing Sun, State Key Laboratory for Novel Software Technology, Nanjing University

DOI:

https://doi.org/10.1609/aaai.v38i7.28606

Keywords:

CV: Applications, CV: Learning & Optimization for CV, CV: Representation Learning for Vision, CV: Scene Analysis & Understanding, CV: Segmentation, ML: Applications, ML: Auto ML and Hyperparameter Tuning

Abstract

In this paper, we present SasWOT, the first training-free Semantic segmentation Architecture Search (SAS) framework via an auto-discovery proxy. Semantic segmentation is widely used in many real-time applications. For fast inference and memory efficiency, previous SAS methods seek the optimal segmenter by differentiable or RL search. However, the significant computational costs of these training-based SAS methods limit their practical usage. To improve search efficiency, we explore the training-free route but empirically observe that existing zero-cost proxies designed for the classification task are sub-optimal on the segmentation benchmark. To address this challenge, we develop a customized proxy search framework for SAS tasks to augment the proxy's predictive capabilities. Specifically, we design the proxy search space based on the following observations: (1) different segmenter statistics can be well combined as inputs; (2) some basic operators can effectively improve the correlation. Thus, we build computational graphs with multiple statistics as inputs and different basic arithmetic operators as the primary operations to represent candidate proxies. Then, we employ an evolutionary algorithm to cross over and mutate the superior candidates in the population based on correlation evaluation. Finally, based on the searched proxy, we perform the segmenter search without candidate training. In this way, SasWOT not only enables automated proxy optimization for SAS tasks but also achieves significant search acceleration before the retraining stage. Extensive experiments on the Cityscapes and CamVid datasets demonstrate that SasWOT achieves a superior trade-off between accuracy and speed over several state-of-the-art techniques. More remarkably, on the Cityscapes dataset, SasWOT achieves 71.3% mIoU at 162 FPS.
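
The proxy search outlined in the abstract (candidate proxies built from statistics and operators, scored by correlation, evolved by crossover and mutation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the operator sets, the fixed two-input proxy shape, the use of Kendall's tau as the correlation measure, and all function and statistic names below are assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random

# Hypothetical primitive sets; the abstract does not list the real ones.
UNARY = {
    "id": lambda x: x,
    "abs": abs,
    "log1p_abs": lambda x: math.log1p(abs(x)),
    "sqrt_abs": lambda x: math.sqrt(abs(x)),
}
BINARY = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def kendall_tau(xs, ys):
    """Kendall rank correlation (tau-a) by counting concordant/discordant pairs."""
    n = len(xs)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def random_proxy(stat_names, rng):
    """Candidate proxy encoded as (binary_op, unary_1, stat_1, unary_2, stat_2)."""
    return (rng.choice(list(BINARY)), rng.choice(list(UNARY)), rng.choice(stat_names),
            rng.choice(list(UNARY)), rng.choice(stat_names))

def evaluate(proxy, stats):
    """Score one architecture from its dict of per-network statistics."""
    b, u1, s1, u2, s2 = proxy
    return BINARY[b](UNARY[u1](stats[s1]), UNARY[u2](stats[s2]))

def fitness(proxy, archs, accs):
    """Rank correlation between proxy scores and known accuracies."""
    return kendall_tau([evaluate(proxy, a) for a in archs], accs)

def crossover(p, q, rng):
    """Uniform crossover: each gene is taken from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(p, q))

def mutate(proxy, stat_names, rng):
    """Resample one randomly chosen gene from its own pool."""
    pools = [list(BINARY), list(UNARY), stat_names, list(UNARY), stat_names]
    genes = list(proxy)
    i = rng.randrange(len(genes))
    genes[i] = rng.choice(pools[i])
    return tuple(genes)

def search_proxy(archs, accs, generations=30, pop_size=20, seed=0):
    """Evolve proxies, keeping the better-correlated half each generation."""
    rng = random.Random(seed)
    stat_names = sorted(archs[0])
    pop = [random_proxy(stat_names, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, archs, accs), reverse=True)
        parents = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), stat_names, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=lambda p: fitness(p, archs, accs))
```

In this sketch, `archs` would be a list of dicts mapping statistic names to values measured on untrained candidate segmenters, and `accs` the accuracies of a small set of already-trained reference networks used only to score proxies. Once a proxy is selected, calling `evaluate` on new candidates ranks them without training, which is the role the searched proxy plays in the final segmenter search.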

Published

2024-03-24

How to Cite

Zhu, C., Li, L., Wu, Y., & Sun, Z. (2024). SasWOT: Real-Time Semantic Segmentation Architecture Search WithOut Training. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 7722-7730. https://doi.org/10.1609/aaai.v38i7.28606

Issue

Section

AAAI Technical Track on Computer Vision VI