TIKP: Text-to-Image Knowledge Preservation for Continual Semantic Segmentation

Authors

  • Zhidong Yu School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China
  • Wei Yang School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China Hefei National Laboratory, Hefei 230088, China
  • Xike Xie School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123, China
  • Zhenbo Shi School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123, China

DOI:

https://doi.org/10.1609/aaai.v38i15.29598

Keywords:

ML: Life-Long and Continual Learning, CV: Scene Analysis & Understanding, CV: Segmentation

Abstract

Continual Semantic Segmentation (CSS) is an emerging research direction in which catastrophic forgetting remains a perplexing problem. In this paper, we propose a Text-to-Image Knowledge Preservation (TIKP) framework to address this issue. TIKP applies text-to-image techniques to CSS through automatic prompt generation and content adaptation. It extracts associations between the labels of seen data and constructs text-level prompts from these associations, which are preserved and maintained at each incremental step. During training, these prompts generate correlated images that mitigate catastrophic forgetting. In particular, since the generated images may follow a different distribution from the original data, TIKP transfers knowledge via a content adaptation loss, which weights the role each generated image plays in incremental training according to its similarity to the original data. In addition, for the classifier, we exploit the previous model from a different perspective: new classes tend to be misclassified as old objects rather than as background. We therefore propose a knowledge distillation loss based on these wrong labels, which assigns varying weights to individual objects during distillation. Extensive experiments conducted under the same setting show that TIKP outperforms state-of-the-art methods by a large margin on benchmark datasets.
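The content adaptation loss described above down-weights generated images that drift from the original data distribution. The paper's exact formulation is not given here; the following is a minimal illustrative sketch, assuming the similarity is measured as cosine similarity between a generated image's feature vector and the mean feature of the original (real) data, and that the resulting weight scales that image's per-image segmentation loss. All function names and the feature-space choice are hypothetical.

```python
import numpy as np

def content_adaptation_weights(gen_feats, ref_feats):
    """Hypothetical sketch: weight each generated image by its cosine
    similarity to the mean feature of real reference images, so that
    off-distribution generations contribute less to training.

    gen_feats: (B, D) features of generated images
    ref_feats: (N, D) features of original (real) images
    returns:   (B,) non-negative weights
    """
    center = ref_feats.mean(axis=0)
    center = center / (np.linalg.norm(center) + 1e-8)
    g = gen_feats / (np.linalg.norm(gen_feats, axis=1, keepdims=True) + 1e-8)
    sim = g @ center                      # cosine similarity, in [-1, 1]
    return np.clip(sim, 0.0, None)        # dissimilar images get weight 0

def content_adaptation_loss(per_image_loss, weights):
    """Similarity-weighted average of per-image segmentation losses."""
    return float(np.mean(weights * per_image_loss))
```

A generated image identical in feature space to the reference mean receives weight 1, while one pointing in an unrelated direction receives weight near 0, so its (possibly misleading) pixels barely influence the incremental update.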

Published

2024-03-24

How to Cite

Yu, Z., Yang, W., Xie, X., & Shi, Z. (2024). TIKP: Text-to-Image Knowledge Preservation for Continual Semantic Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(15), 16596-16604. https://doi.org/10.1609/aaai.v38i15.29598

Issue

Vol. 38 No. 15 (2024)

Section

AAAI Technical Track on Machine Learning VI