TY - JOUR
AU - Xu, Zhiqin John
AU - Zhou, Hanxu
PY - 2021/05/18
Y2 - 2022/01/24
TI - Deep Frequency Principle Towards Understanding Why Deeper Learning Is Faster
JF - Proceedings of the AAAI Conference on Artificial Intelligence
JA - AAAI
VL - 35
IS - 12
SE - AAAI Technical Track on Machine Learning V
DO -
UR - https://ojs.aaai.org/index.php/AAAI/article/view/17261
SP - 10541
EP - 10550
AB - Understanding the effect of depth in deep learning is a critical problem. In this work, we utilize Fourier analysis to empirically provide a promising mechanism to understand why feedforward deeper learning is faster. To this end, we separate a deep neural network, trained by normal stochastic gradient descent, into two parts during analysis, i.e., a pre-condition component and a learning component, in which the output of the pre-condition component is the input of the learning component. We use a filtering method to characterize the frequency distribution of a high-dimensional function. Based on experiments with deep networks and real datasets, we propose a deep frequency principle, that is, the effective target function for a deeper hidden layer biases towards lower frequency during training. Therefore, the learning component effectively learns a lower frequency function if the pre-condition component has more layers. Due to the well-studied frequency principle, i.e., deep neural networks learn lower frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster. We believe these empirical studies would be valuable for future theoretical studies of the effect of depth in deep learning.
ER -