RLEKF: An Optimizer for Deep Potential with Ab Initio Accuracy

Authors

  • Siyu Hu State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China
  • Wentao Zhang School of Advanced Materials, Shenzhen Graduate School, Peking University
  • Qiuchen Sha State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China
  • Feng Pan School of Advanced Materials, Shenzhen Graduate School, Peking University
  • Lin-Wang Wang Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China
  • Weile Jia State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences
  • Guangming Tan State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences
  • Tong Zhao State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v37i7.25957

Keywords:

ML: Applications, ML: Deep Learning Theory, ML: Deep Neural Architectures, ML: Deep Neural Network Algorithms, ML: Learning Theory, ML: Optimization, ML: Probabilistic Methods, ML: Scalability of ML Systems

Abstract

It is imperative to accelerate the training of neural network force fields such as Deep Potential, which usually requires thousands of images based on first-principles calculations and days of training to generate an accurate potential energy surface. To this end, we propose a novel optimizer named reorganized layer extended Kalman filtering (RLEKF), an optimized version of global extended Kalman filtering (GEKF) with a strategy of splitting big layers and gathering small layers to overcome the O(N^2) computational cost of GEKF. This strategy approximates the dense weight error covariance matrix of GEKF with a sparse block-diagonal matrix. We implement both RLEKF and the baseline Adam in our alphaDynamics package, and numerical experiments are performed on 13 unbiased datasets. Overall, RLEKF converges faster with slightly better accuracy. For example, a test on a typical system, bulk copper, shows that RLEKF converges faster in terms of both the number of training epochs (x11.67) and wall-clock time (x1.19). In addition, we theoretically prove that the weight updates converge, which guards against the gradient explosion problem. Experimental results verify that RLEKF is not sensitive to the initialization of weights. RLEKF sheds light on other AI-for-science applications where training a large neural network (with tens of thousands of parameters) is a bottleneck.
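The block-diagonal approximation described in the abstract can be illustrated with a minimal layer-wise EKF weight update. The sketch below is an assumption-laden illustration, not the paper's alphaDynamics implementation: the class name BlockEKF, the block sizes, and the noise parameter are hypothetical, and a single scalar training error per step is assumed.

```python
# Minimal sketch of a block-diagonal (layer-wise) EKF weight update.
# Hypothetical names and parameters; not the authors' exact RLEKF code.
import numpy as np

class BlockEKF:
    def __init__(self, block_sizes, p0=1.0, noise=1e-3):
        # One error-covariance block per reorganized layer group: the sparse
        # block-diagonal approximation of GEKF's dense covariance matrix.
        self.P = [p0 * np.eye(n) for n in block_sizes]
        self.noise = noise  # measurement-noise term (scalar observation)

    def step(self, weights, grads, error):
        """weights, grads: lists of 1-D arrays (one per block); error: scalar."""
        new_weights = []
        for w, g, P in zip(weights, grads, self.P):
            h = g[:, None]                       # Jacobian of the scalar output w.r.t. this block
            s = float(h.T @ P @ h) + self.noise  # innovation variance
            k = (P @ h) / s                      # Kalman gain for this block
            new_weights.append(w + k[:, 0] * error)
            P -= k @ (h.T @ P)                   # covariance update stays block-local
        return new_weights
```

Because each block is updated independently, the per-step cost scales with the sum of squared block sizes rather than the square of the total parameter count, which is the motivation for splitting big layers and gathering small ones.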

Published

2023-06-26

How to Cite

Hu, S., Zhang, W., Sha, Q., Pan, F., Wang, L.-W., Jia, W., Tan, G., & Zhao, T. (2023). RLEKF: An Optimizer for Deep Potential with Ab Initio Accuracy. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 7910-7918. https://doi.org/10.1609/aaai.v37i7.25957

Issue

Vol. 37 No. 7 (2023)

Section

AAAI Technical Track on Machine Learning II