Adaptive Knowledge Driven Regularization for Deep Neural Networks
Keywords: (Deep) Neural Network Algorithms
Abstract
In many real-world applications, the amount of data available for training is limited, so inductive bias and auxiliary knowledge are needed to regularize model training. One popular regularization method is to impose prior distribution assumptions on model parameters, and many recent works also attempt to regularize training by integrating external knowledge into specific neurons. However, existing regularization methods do not take into account the interaction between connected neuron pairs, which is invaluable internal knowledge for adapting regularization as training progresses and thereby learning better representations. In this paper, we explicitly take into account the interaction between connected neurons and propose an adaptive, internal-knowledge-driven regularization method, CORR-Reg. The key idea of CORR-Reg is to give a higher significance weight to connections between more correlated neuron pairs. The significance weights adaptively identify the more important input neurons for each neuron. Instead of regularizing connection parameters with a static strength, as weight decay does, CORR-Reg imposes weaker regularization on more significant connections. As a consequence, neurons attend to more informative input features and thus learn more diversified and discriminative representations. We derive CORR-Reg within a Bayesian inference framework and propose a novel optimization algorithm based on the Lagrange multiplier method and Stochastic Gradient Descent. Extensive evaluations on diverse benchmark datasets and neural network structures show that CORR-Reg achieves significant improvements over state-of-the-art regularization methods.
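The key idea stated above (more correlated neuron pairs get higher significance, and more significant connections get weaker regularization) can be illustrated with a minimal NumPy sketch. This is not the paper's Bayesian derivation or its Lagrange-multiplier optimizer. It assumes, purely for illustration, that significance is the absolute Pearson correlation between input and output activations over a batch, and that the per-connection strength is `base_strength * (1 - significance)`. The function names and the `base_strength` parameter are hypothetical.

```python
import numpy as np

def correlation_significance(inputs, outputs, eps=1e-8):
    """Absolute Pearson correlation between every input/output neuron pair.

    inputs:  (batch, n_in) activations feeding a layer
    outputs: (batch, n_out) activations produced by the layer
    Returns an (n_in, n_out) array with values in [0, 1].
    Assumption: |correlation| stands in for the paper's significance weight.
    """
    x = inputs - inputs.mean(axis=0)
    y = outputs - outputs.mean(axis=0)
    cov = x.T @ y / inputs.shape[0]
    std = np.outer(x.std(axis=0), y.std(axis=0)) + eps
    return np.clip(np.abs(cov / std), 0.0, 1.0)

def adaptive_l2_penalty(weights, significance, base_strength=1e-2):
    """L2 penalty whose strength shrinks as a connection's significance grows.

    weights and significance are both (n_in, n_out).
    Assumption: strength = base_strength * (1 - significance); with zero
    significance everywhere this reduces to a plain L2 (weight-decay-style) penalty.
    """
    strength = base_strength * (1.0 - significance)
    return float(np.sum(strength * weights ** 2))

# Toy layer: the output depends only on the first of three inputs.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 3))
w = np.array([[2.0], [0.0], [0.0]])
y = x @ w + 0.1 * rng.normal(size=(256, 1))

sig = correlation_significance(x, y)
penalty = adaptive_l2_penalty(np.ones((3, 1)), sig, base_strength=0.1)
```

In this toy setup the connection from the first input receives a significance close to 1 and is therefore penalized far less than the two uninformative connections, which is the qualitative behavior the abstract describes.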
How to Cite
Luo, Z., Cai, S., Cui, C., Ooi, B. C., & Yang, Y. (2021). Adaptive Knowledge Driven Regularization for Deep Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 35(10), 8810-8818. https://doi.org/10.1609/aaai.v35i10.17067
AAAI Technical Track on Machine Learning III