Adaptive Normalized Risk-Averting Training for Deep Neural Networks

Authors

  • Zhiguang Wang, University of Maryland Baltimore County
  • Tim Oates, University of Maryland Baltimore County
  • James Lo, University of Maryland Baltimore County

DOI:

https://doi.org/10.1609/aaai.v30i1.10189

Keywords:

risk-averting error, convex optimization, deep neural networks

Abstract

This paper proposes a set of new error criteria and a learning approach, called Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks without pretraining. Theoretically, we demonstrate its effectiveness by showing that the new criteria expand the convexity region of the error surface. By analyzing the gradient with respect to the convexity index $\lambda$, we explain why the method works under plain gradient descent. In practice, we show that ANRAT improves the training of deep neural networks on visual recognition tasks using the MNIST and CIFAR-10 datasets. With simple experimental settings, and without pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets with MSE or cross-entropy loss. We also explore performance on deep and shallow multilayer perceptrons and denoising auto-encoders. ANRAT can be combined with quasi-Newton training methods, innovative network variants, regularization techniques, and other common tricks for DNNs. Beyond unsupervised pretraining, it offers a new perspective on the non-convex optimization problem in training DNNs.
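The abstract does not reproduce the error criteria themselves. As a rough illustration of the idea, the sketch below implements the normalized risk-averting error (NRAE), $C_\lambda(w) = \frac{1}{\lambda}\ln\big(\frac{1}{N}\sum_k e^{\lambda e_k^2}\big)$, which underlies this line of work, and trains the convexity index $\lambda$ jointly with the network weights by plain gradient descent, as the abstract suggests. The toy network, hyperparameters, and the penalty term keeping $\lambda$ away from zero are placeholders for illustration, not the authors' published formulation.

    import math
    import torch
    import torch.nn as nn

    def nrae(pred, target, lam):
        # Normalized risk-averting error:
        #   C_lam = (1/lam) * log((1/N) * sum_k exp(lam * e_k^2)),
        # computed via logsumexp so exp(lam * e_k^2) cannot overflow.
        err2 = (pred - target).pow(2).reshape(pred.shape[0], -1).sum(dim=1)
        return (torch.logsumexp(lam * err2, dim=0) - math.log(err2.shape[0])) / lam

    # Toy setup: a small MLP on random stand-in data (shapes echo flattened MNIST).
    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    lam = nn.Parameter(torch.tensor(1.0))   # convexity index, learned with the weights
    opt = torch.optim.SGD(list(model.parameters()) + [lam], lr=0.1)

    x = torch.randn(64, 784)
    y = torch.nn.functional.one_hot(torch.randint(0, 10, (64,)), 10).float()

    for step in range(100):
        opt.zero_grad()
        # Hypothetical penalty 0.1 / lam^2 discourages lam -> 0; the paper's exact
        # scheme for adapting lambda may differ.
        loss = nrae(model(x), y, lam) + 0.1 / lam.pow(2)
        loss.backward()
        opt.step()

Two properties of the criterion are worth noting: as $\lambda \to 0$ the NRAE approaches the ordinary mean squared error, while as $\lambda \to \infty$ it approaches the maximum per-example error, so $\lambda$ interpolates between average-case and worst-case training; and the logsumexp formulation keeps the loss finite for large $\lambda$, where computing $e^{\lambda e_k^2}$ directly would overflow.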

Published

2016-03-02

How to Cite

Wang, Z., Oates, T., & Lo, J. (2016). Adaptive Normalized Risk-Averting Training for Deep Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10189

Section

Technical Papers: Machine Learning Methods