Reducing Divergence in Batch Normalization for Domain Adaptation

Authors

  • Ellen Yi-Ge, Carnegie Mellon University
  • Mingjing Wu, Nanyang Technological University
  • Zhenghan Chen, Microsoft (China) Co., Ltd

DOI:

https://doi.org/10.1609/aaai.v39i21.34369

Abstract

Batch Normalization (BN) is widely used in modern deep neural networks and has proven effective in Unsupervised Domain Adaptation (UDA) for cross-domain applications. However, existing BN variants often mix source and target domain information within the same channels, which can compromise transferability because features from the two domains are misaligned. To address this limitation, we introduce Refined Batch Normalization (RBN), a normalization method that uses the estimated shift to quantify the discrepancy between estimated population statistics and their expected values. Our key observation is that estimated shift can accumulate as BN layers are stacked within the network, potentially degrading target domain performance. We show how RBN mitigates this accumulation and thereby improves overall performance. We implement the technique as the RBNBlock, which replaces conventional BN with RBN in the bottleneck architecture of residual networks. Extensive experiments on diverse cross-domain benchmarks confirm that RBN improves inter-domain transferability. Beyond immediate performance gains, this perspective offers a basis for future research on the interplay between normalization strategies and domain adaptation.
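
The mismatch between estimated population statistics and the statistics of the data actually seen at test time can be illustrated with a small NumPy sketch. This is not the authors' RBN or their definition of estimated shift; the functions `running_stats` and `shift_gap`, the momentum value, and the synthetic Gaussian "source" and "target" data are all assumptions made for illustration. The sketch tracks BN-style running statistics on source batches and then compares them with held-out source data and with shifted target data.

```python
import numpy as np


def running_stats(batches, momentum=0.1):
    """Track per-channel running mean/variance with an exponential moving
    average, in the style BN layers use during training (illustrative)."""
    channels = batches[0].shape[1]
    mean = np.zeros(channels)
    var = np.ones(channels)
    for x in batches:
        mean = (1 - momentum) * mean + momentum * x.mean(axis=0)
        var = (1 - momentum) * var + momentum * x.var(axis=0)
    return mean, var


def shift_gap(est_mean, est_var, data):
    """Hypothetical proxy for a statistics gap: mean absolute difference
    between the estimated statistics and the data's own statistics."""
    mean_gap = np.abs(est_mean - data.mean(axis=0)).mean()
    var_gap = np.abs(est_var - data.var(axis=0)).mean()
    return float(mean_gap + var_gap)


rng = np.random.default_rng(0)

# Synthetic stand-ins for the two domains (assumed, not from the paper).
source_batches = [rng.normal(0.0, 1.0, size=(64, 8)) for _ in range(200)]
source_eval = rng.normal(0.0, 1.0, size=(4096, 8))
target_eval = rng.normal(1.5, 2.0, size=(4096, 8))

est_mean, est_var = running_stats(source_batches)
gap_source = shift_gap(est_mean, est_var, source_eval)
gap_target = shift_gap(est_mean, est_var, target_eval)

print(f"gap on source data: {gap_source:.3f}")
print(f"gap on target data: {gap_target:.3f}")
```

In this toy setting the gap on the target data is much larger than on held-out source data, because the running statistics were estimated from the source distribution only. How such a gap propagates through stacked BN layers, and how RBN reduces it, is the subject of the paper itself and is not modeled here.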

Published

2025-04-11

How to Cite

Yi-Ge, E., Wu, M., & Chen, Z. (2025). Reducing Divergence in Batch Normalization for Domain Adaptation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(21), 22155-22163. https://doi.org/10.1609/aaai.v39i21.34369

Section

AAAI Technical Track on Machine Learning VII