Amplitude Spectrum Transformation for Open Compound Domain Adaptive Semantic Segmentation


  • Jogendra Nath Kundu, Indian Institute of Science
  • Akshay R Kulkarni, Indian Institute of Science
  • Suvaansh Bhambri, Indian Institute of Science
  • Varun Jampani, Google
  • Venkatesh Babu Radhakrishnan, Indian Institute of Science



Keywords: Computer Vision (CV), Machine Learning (ML), Domain(s) Of Application (APP)


Open compound domain adaptation (OCDA) has emerged as a practical adaptation setting that pairs a single labeled source domain with a compound of multi-modal, unlabeled target data, with the goal of generalizing better to novel, unseen domains. We hypothesize that better disentanglement of the domain-related and task-related factors in dense intermediate-layer features can greatly aid OCDA. Prior art attempts this indirectly by applying adversarial domain discriminators to the spatial CNN output. However, we find that latent features derived from the Fourier-based amplitude spectrum of deep CNN features map more tractably to domain discrimination. Motivated by this, we propose a novel feature-space Amplitude Spectrum Transformation (AST). During adaptation, we employ the AST auto-encoder for two purposes. First, carefully mined source-target instance pairs undergo simulated cross-domain feature stylization (AST-Sim) at a particular layer, achieved by altering the AST latent. Second, an AST operating at a later layer normalizes the domain content (AST-Norm) by fixing its latent to a mean prototype. Our simplified adaptation technique is both clustering-free and free of complex adversarial alignment. We achieve leading performance against prior art on the OCDA scene segmentation benchmarks.
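
To make the amplitude-spectrum idea concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a feature map of shape (C, H, W) and edits the raw Fourier amplitude directly, whereas the abstract describes an auto-encoder that edits a latent derived from that amplitude. The function names (split_spectrum, merge_spectrum, mix_amplitude, normalize_amplitude) and the blending weight alpha are hypothetical, introduced only for illustration.

```python
import numpy as np

def split_spectrum(feat):
    """Per-channel 2D FFT of a real (C, H, W) array -> (amplitude, phase)."""
    spec = np.fft.fft2(feat, axes=(-2, -1))
    return np.abs(spec), np.angle(spec)

def merge_spectrum(amplitude, phase):
    """Inverse of split_spectrum: rebuild a real (C, H, W) array."""
    spec = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spec, axes=(-2, -1)).real

def mix_amplitude(feat_a, feat_b, alpha=1.0):
    """Stylization stand-in (assumption): keep the phase of feat_a and
    blend its amplitude toward that of feat_b with weight alpha."""
    amp_a, pha_a = split_spectrum(feat_a)
    amp_b, _ = split_spectrum(feat_b)
    amp = (1.0 - alpha) * amp_a + alpha * amp_b
    return merge_spectrum(amp, pha_a)

def normalize_amplitude(feat, prototype_amp):
    """Normalization stand-in (assumption): keep the phase of feat and
    replace its amplitude with a fixed prototype amplitude."""
    _, pha = split_spectrum(feat)
    return merge_spectrum(prototype_amp, pha)

# Toy usage with random arrays standing in for CNN features.
rng = np.random.default_rng(0)
source_feat = rng.standard_normal((4, 8, 8))
target_feat = rng.standard_normal((4, 8, 8))

stylized = mix_amplitude(source_feat, target_feat, alpha=1.0)

# A mean prototype taken here as the average amplitude over a small set.
prototype = np.mean(
    [split_spectrum(f)[0] for f in (source_feat, target_feat)], axis=0
)
normalized = normalize_amplitude(source_feat, prototype)
```

In this simplified form, the phase carries the spatial layout of the features while the amplitude is treated as the domain-related part, which is the intuition the abstract relies on. Working on a learned latent of the amplitude, as AST does, is a design choice this sketch does not reproduce.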




How to Cite

Kundu, J. N., Kulkarni, A. R., Bhambri, S., Jampani, V., & Radhakrishnan, V. B. (2022). Amplitude Spectrum Transformation for Open Compound Domain Adaptive Semantic Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 36(2), 1220-1227.



AAAI Technical Track on Computer Vision II