Addressing Domain Gap via Content Invariant Representation for Semantic Segmentation


  • Li Gao, Wuhan University
  • Lefei Zhang, Wuhan University
  • Qian Zhang, Horizon Robotics


Transfer/Adaptation/Multi-task/Meta/Automated Learning, Segmentation


Unsupervised domain adaptation for semantic segmentation is a major challenge in computer vision because acquiring pixel-level labels is time-consuming and requires expensive human labor. A large gap exists between the data distributions of different domains, which causes severe performance loss when a model trained on synthetic data is applied to real data. Hence, we propose a novel domain adaptation approach, called Content Invariant Representation Network, to narrow the domain gap between the source (S) and target (T) domains. Previous works developed networks that transfer knowledge directly from S to T. In contrast, the proposed method progressively reduces the gap between S and T on the basis of a Content Invariant Representation (CIR). CIR is an intermediate domain (I) that shares invariant content with S and has a data distribution similar to that of T. Then, an Ancillary Classifier Module (ACM) is designed to focus on pixel-level details and generate attention-aware results. ACM adaptively assigns different weights to pixels according to their domain offsets, thereby reducing local domain gaps. The global domain gap between CIR and T is also narrowed by enforcing local alignments. Finally, we perform self-supervised training on the pseudo-labeled target domain to further fit the distribution of the real data. Comprehensive experiments on two domain adaptation tasks, namely GTAV → Cityscapes and SYNTHIA → Cityscapes, demonstrate the superiority of our method over state-of-the-art methods.
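The self-supervised step mentioned in the abstract relies on pseudo-labels for the unlabeled target domain. The abstract does not state how those labels are selected, so the following is only a minimal sketch of one common approach, confidence-thresholded pseudo-labeling, and not the authors' procedure. The function names, the `threshold` value of 0.9, and the use of 255 as an "ignore" index are all assumptions made for illustration.

```python
import math

# Assumed "ignore" index: pixels with this label would be skipped by the
# training loss. The value 255 is an illustrative convention, not from the paper.
IGNORE_LABEL = 255


def softmax(logits):
    """Convert a list of per-class scores into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_labels(logit_map, threshold=0.9):
    """Turn an H x W x C nested list of scores into an H x W label map.

    A pixel keeps its most likely class only if the predicted probability
    reaches `threshold` (a hypothetical setting); otherwise it is marked
    with IGNORE_LABEL.
    """
    labels = []
    for row in logit_map:
        out_row = []
        for pixel_logits in row:
            probs = softmax(pixel_logits)
            best = max(range(len(probs)), key=probs.__getitem__)
            out_row.append(best if probs[best] >= threshold else IGNORE_LABEL)
        labels.append(out_row)
    return labels


# Toy 2 x 2 prediction over 3 classes: two confident pixels, two uncertain ones.
toy_logits = [
    [[5.0, 0.0, 0.0], [0.1, 0.0, 0.2]],
    [[0.0, 6.0, 0.0], [0.0, 0.0, 0.0]],
]
toy_labels = pseudo_labels(toy_logits)
```

In a setup like this, a segmentation model would then be retrained on target images using only the pixels that received a class label, which is one way to fit the real-data distribution more closely without manual annotation.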




How to Cite

Gao, L., Zhang, L., & Zhang, Q. (2021). Addressing Domain Gap via Content Invariant Representation for Semantic Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 7528-7536. Retrieved from



AAAI Technical Track on Machine Learning II