HyperAdam: A Learnable Task-Adaptive Adam for Network Training

Shipeng Wang; Jian Sun; Zongben Xu

doi:10.1609/aaai.v33i01.33015297

Authors

Shipeng Wang, Xi'an Jiaotong University
Jian Sun, Xi'an Jiaotong University
Zongben Xu, Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v33i01.33015297

Abstract

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms, such as SGD and Adam. Recently, learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully utilize the experience embodied in human-designed optimizers, and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of "learning to optimize" with the traditional Adam optimizer. Given a network to train, the parameter update generated by HyperAdam in each iteration is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are learned adaptively depending on the task. HyperAdam is modeled as a recurrent neural network with AdamCell, WeightCell and StateCell. It is shown to achieve state-of-the-art performance in training various networks, such as multilayer perceptrons, CNNs and LSTMs.
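The core idea in the abstract, an update formed as a weighted combination of several Adam updates with different decay rates, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. In HyperAdam the decay rates and combination weights are produced by learned recurrent cells, whereas here they are fixed by hand. The function names, the use of one decay rate for both moments, and the values of `betas`, `weights` and `lr` are all assumptions made for illustration.

```python
import numpy as np

def adam_candidates(grad, m, v, betas, t, eps=1e-8):
    """Compute one bias-corrected Adam direction per decay rate.

    m, v  : moment estimates of shape (J, *grad.shape), one pair per decay rate.
    betas : shape (J,). Assumption of this sketch: the same rate is used
            for the first and second moments.
    t     : 1-based iteration counter, used for bias correction.
    """
    b = betas.reshape(-1, *([1] * grad.ndim))  # broadcast over parameter dims
    m = b * m + (1.0 - b) * grad
    v = b * v + (1.0 - b) * grad ** 2
    m_hat = m / (1.0 - b ** t)
    v_hat = v / (1.0 - b ** t)
    return m_hat / (np.sqrt(v_hat) + eps), m, v

def combined_step(theta, grad, m, v, betas, weights, t, lr=0.1):
    """Apply a weighted combination of the candidate Adam directions.

    weights: shape (J,). Hand-fixed here; in HyperAdam these would be
             generated adaptively by a learned component.
    """
    directions, m, v = adam_candidates(grad, m, v, betas, t)
    w = weights.reshape(-1, *([1] * grad.ndim))
    update = (w * directions).sum(axis=0)
    return theta - lr * update, m, v

# Toy usage on f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
betas = np.array([0.5, 0.9, 0.99])     # hypothetical decay rates
weights = np.array([0.2, 0.5, 0.3])    # hypothetical combination weights
theta = np.array([1.0, -2.0])
m = np.zeros((3, 2))
v = np.zeros((3, 2))
theta, m, v = combined_step(theta, theta, m, v, betas, weights, t=1)
```

With a single decay rate and a weight of 1, `combined_step` reduces to an Adam step under the simplification noted above, which is a useful sanity check when experimenting with the sketch.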
