Batch Normalization Is Blind to the First and Second Derivatives of the Loss
DOI:
https://doi.org/10.1609/aaai.v38i18.29978
Keywords:
PEAI: Accountability, Interpretability & Explainability, ML: Deep Learning Theory
Abstract
We prove that when the loss function is expanded as a Taylor series, the BN operation blocks the influence of the first-order term and most of the influence of the second-order term of the loss. We also find that this problem is caused by the standardization phase of the BN operation. We believe that proving the blocking of certain loss terms provides an analytic perspective on potential defects of deep models with BN operations, although the blocking problem is not fully equivalent to significant damage across all tasks on benchmark datasets. Experiments show that the BN operation significantly affects feature representations in specific tasks.
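The blocking effect described in the abstract is attributed to the standardization (mean and variance normalization) phase of BN. The following is a minimal numerical sketch of one invariance behind that effect, not the paper's derivation; it assumes PyTorch is available, and all names (standardize, shift, the toy loss) are hypothetical. It only shows that a constant perturbation shared by the whole batch is cancelled by the mean subtraction, so the loss gradient along that direction vanishes.

```python
# Minimal sketch (assumption: PyTorch; names are hypothetical), not the paper's proof.
# It illustrates that the standardization phase of BN removes a batch-wide constant
# shift, so the first-order (gradient) term of the loss along that direction is blocked.
import torch

def standardize(x, eps=1e-5):
    # Standardization phase of BN: subtract the batch mean, divide by the batch std.
    mu = x.mean(dim=0, keepdim=True)
    var = x.var(dim=0, unbiased=False, keepdim=True)
    return (x - mu) / torch.sqrt(var + eps)

torch.manual_seed(0)
x = torch.randn(32, 8)                         # toy batch of intermediate features
shift = torch.zeros(1, 8, requires_grad=True)  # a perturbation shared by the whole batch

y = standardize(x + shift)   # the same shift is added to every sample, then standardized
loss = (y ** 2).sum()        # any differentiable toy loss on the standardized output
loss.backward()

# The mean subtraction cancels the shared shift, so its gradient is (numerically) zero:
print(shift.grad.abs().max())  # on the order of floating-point error, i.e. this direction is blocked
```

This sketch covers only the simplest blocked direction; the paper's result concerns how the standardization suppresses the first-order term and most of the second-order term in the Taylor expansion of the loss more generally.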
Published
2024-03-24
How to Cite
Zhou, Z., Shen, W., Chen, H., Tang, L., Chen, Y., & Zhang, Q. (2024). Batch Normalization Is Blind to the First and Second Derivatives of the Loss. Proceedings of the AAAI Conference on Artificial Intelligence, 38(18), 20010-20018. https://doi.org/10.1609/aaai.v38i18.29978
Section
AAAI Technical Track on Philosophy and Ethics of AI