HyperAdam: A Learnable Task-Adaptive Adam for Network Training

Authors

  • Shipeng Wang, Xi'an Jiaotong University
  • Jian Sun, Xi'an Jiaotong University
  • Zongben Xu, Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v33i01.33015297

Abstract

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms, such as SGD and Adam. Recently, learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully utilize the experience embedded in human-designed optimizers and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of “learning to optimize” with the traditional Adam optimizer. Given a network to train, the parameter update generated by HyperAdam in each iteration is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are adaptively learned depending on the task. HyperAdam is modeled as a recurrent neural network consisting of an AdamCell, a WeightCell and a StateCell. It is shown to achieve state-of-the-art performance for training various networks, such as multilayer perceptrons, CNNs and LSTMs.
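For intuition, below is a minimal NumPy sketch of the combination the abstract describes: K Adam candidate updates, each computed with its own decay-rate pair, are mixed into a single parameter update. In HyperAdam the decay rates and combination weights are produced per task by the learned AdamCell, WeightCell and StateCell; the fixed `betas` and `weights` values here are illustrative assumptions only, not the learned quantities.

```python
import numpy as np

def hyperadam_step(grad, ms, vs, t, betas, weights, lr=0.1, eps=1e-8):
    """One HyperAdam-style step: an adaptive combination of K Adam
    candidate updates, each using its own decay-rate pair (beta1_k, beta2_k)."""
    candidates = []
    for k, (b1, b2) in enumerate(betas):
        # Standard Adam first/second moment updates for this decay-rate pair.
        ms[k] = b1 * ms[k] + (1 - b1) * grad
        vs[k] = b2 * vs[k] + (1 - b2) * grad ** 2
        # Bias-corrected moments, as in Adam.
        m_hat = ms[k] / (1 - b1 ** t)
        v_hat = vs[k] / (1 - b2 ** t)
        candidates.append(m_hat / (np.sqrt(v_hat) + eps))
    # Weighted combination of the K candidate updates
    # (in HyperAdam these weights come from the learned WeightCell).
    return -lr * sum(w * c for w, c in zip(weights, candidates))

# Toy usage: minimize f(x) = ||x||^2 / 2, whose gradient is x.
K, dim = 3, 5
betas = [(0.9, 0.999), (0.8, 0.99), (0.5, 0.9)]  # assumed decay-rate pairs
weights = np.full(K, 1.0 / K)                    # assumed uniform weights
ms = [np.zeros(dim) for _ in range(K)]
vs = [np.zeros(dim) for _ in range(K)]
x = np.random.randn(dim)
for t in range(1, 301):
    x += hyperadam_step(x, ms, vs, t, betas, weights)
print(np.linalg.norm(x))  # approaches 0 up to the step-size scale
```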

Published

2019-07-17

How to Cite

Wang, S., Sun, J., & Xu, Z. (2019). HyperAdam: A Learnable Task-Adaptive Adam for Network Training. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 5297-5304. https://doi.org/10.1609/aaai.v33i01.33015297

Issue

Vol. 33 No. 01 (2019)

Section

AAAI Technical Track: Machine Learning