Adaptive Normalized Risk-Averting Training for Deep Neural Networks

Authors

  • Zhiguang Wang, University of Maryland Baltimore County
  • Tim Oates, University of Maryland Baltimore County
  • James Lo, University of Maryland Baltimore County

DOI:

https://doi.org/10.1609/aaai.v30i1.10189

Keywords:

risk-averting error, convex optimization, deep neural networks

Abstract

This paper proposes a set of new error criteria and a learning approach, called Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks without pretraining. Theoretically, we demonstrate its effectiveness by showing that the new criteria expand the convexity region of the error surface. By analyzing the gradient with respect to the convexity index $\lambda$, we explain why the method works under plain gradient descent. In practice, we show that ANRAT improves the training of deep neural networks on visual recognition tasks using the MNIST and CIFAR-10 datasets. With simple experimental settings, and without pretraining or other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets with MSE or cross-entropy loss. We also explore performance on deep and shallow multilayer perceptrons and denoising auto-encoders. ANRAT can be combined with quasi-Newton training methods, innovative network variants, regularization techniques, and other common tricks for DNNs. Beyond unsupervised pretraining, it offers a new perspective on the non-convex optimization problem in training DNNs.
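The abstract does not reproduce the error criteria themselves. As a rough illustration of the idea, the sketch below implements the normalized risk-averting error (NRAE), $C_\lambda(w) = \frac{1}{\lambda}\ln\big(\frac{1}{N}\sum_k e^{\lambda e_k^2}\big)$, which underlies this line of work, and trains the convexity index $\lambda$ jointly with the network weights by plain gradient descent, as the abstract suggests. The toy network, hyperparameters, and the penalty term keeping $\lambda$ away from zero are placeholders for illustration, not the authors' published formulation.

    import math
    import torch
    import torch.nn as nn

    def nrae(pred, target, lam):
        # Normalized risk-averting error:
        #   C_lam = (1/lam) * log((1/N) * sum_k exp(lam * e_k^2)),
        # computed via logsumexp so exp(lam * e_k^2) cannot overflow.
        err2 = (pred - target).pow(2).reshape(pred.shape[0], -1).sum(dim=1)
        return (torch.logsumexp(lam * err2, dim=0) - math.log(err2.shape[0])) / lam

    # Toy setup: a small MLP on random stand-in data (shapes echo flattened MNIST).
    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    lam = nn.Parameter(torch.tensor(1.0))   # convexity index, learned with the weights
    opt = torch.optim.SGD(list(model.parameters()) + [lam], lr=0.1)

    x = torch.randn(64, 784)
    y = torch.nn.functional.one_hot(torch.randint(0, 10, (64,)), 10).float()

    for step in range(100):
        opt.zero_grad()
        # Hypothetical penalty 0.1 / lam^2 discourages lam -> 0; the paper's exact
        # scheme for adapting lambda may differ.
        loss = nrae(model(x), y, lam) + 0.1 / lam.pow(2)
        loss.backward()
        opt.step()

Two properties of the criterion are worth noting: as $\lambda \to 0$ the NRAE approaches the ordinary mean squared error, while as $\lambda \to \infty$ it approaches the maximum per-example error, so $\lambda$ interpolates between average-case and worst-case training; and the logsumexp formulation keeps the loss finite for large $\lambda$, where computing $e^{\lambda e_k^2}$ directly would overflow.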

Published

2016-03-02

How to Cite

Wang, Z., Oates, T., & Lo, J. (2016). Adaptive Normalized Risk-Averting Training for Deep Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10189

Section

Technical Papers: Machine Learning Methods