MaskViM: Domain Generalized Semantic Segmentation with State Space Models

Jiahao Li; Yang Lu; Yuan Xie; Yanyun Qu

doi:10.1609/aaai.v39i5.32502

Authors

Jiahao Li: School of Informatics, Xiamen University
Yang Lu: School of Informatics, Xiamen University; Institute of Artificial Intelligence, Xiamen University; Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
Yuan Xie: School of Computer Science and Technology, East China Normal University; Chongqing Institute of East China Normal University
Yanyun Qu: School of Informatics, Xiamen University; Institute of Artificial Intelligence, Xiamen University; Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University

DOI:

https://doi.org/10.1609/aaai.v39i5.32502

Abstract

Domain Generalized Semantic Segmentation (DGSS) aims to train a segmentation model on known source domains so that it can make predictions on unseen target domains. Existing methods rely on one of two network architectures: Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs). Both face challenges: CNN-based methods lack a global receptive field, while ViT-based methods incur a higher computational cost. Drawing inspiration from State Space Models (SSMs), which offer a global receptive field while maintaining linear complexity, we propose an SSM-based method for DGSS. In this work, we first explain why masking makes sense in SSM-based DGSS and propose a mask learning mechanism. Building on this mechanism, we present the Mask Vision Mamba network (MaskViM), a model for SSM-based DGSS, and design a mask loss to optimize it. Our method achieves superior performance in four diverse DGSS settings, demonstrating its effectiveness.
