Generalization Error Bounds for Optimization Algorithms via Stability

Authors

  • Qi Meng Peking University
  • Yue Wang Beijing Jiaotong University
  • Wei Chen Microsoft Research
  • Taifeng Wang Microsoft Research
  • Zhi-Ming Ma Chinese Academy of Mathematics and Systems Science
  • Tie-Yan Liu Microsoft Research

DOI:

https://doi.org/10.1609/aaai.v31i1.10919

Keywords:

generalization, stability, optimization, stochastic gradient descent, variance reduction

Abstract

Many machine learning tasks can be formulated as Regularized Empirical Risk Minimization (R-ERM) and solved by optimization algorithms such as gradient descent (GD), stochastic gradient descent (SGD), and stochastic variance reduction (SVRG). Conventional analysis of these optimization algorithms focuses on their convergence rates during the training process; however, people in the machine learning community may care more about the generalization performance of the learned model on unseen test data. In this paper, we investigate this issue by using stability as a tool. In particular, we decompose the generalization error for R-ERM and derive its upper bound for both convex and nonconvex cases. In convex cases, we prove that the generalization error can be bounded by the convergence rate of the optimization algorithm and the stability of the R-ERM process, both in expectation (in the order of
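To make the setup concrete, below is a minimal sketch (not the paper's code) of the two stochastic optimizers the abstract names, applied to a toy R-ERM objective. The quadratic loss, synthetic 1-D data, step size, and epoch counts are all illustrative assumptions; SVRG's variance-reduced update uses a full gradient recomputed at a per-epoch snapshot, which is what lets it converge with a constant step size while plain SGD hovers in a noise ball.

```python
# Sketch: SGD vs. SVRG on a toy R-ERM problem (illustrative assumptions only).
# Objective: F(w) = (1/n) * sum_i (w*x_i - y_i)^2 / 2 + (lam/2) * w^2
import random

random.seed(0)

# Hypothetical synthetic data: y ~ 2x plus small noise.
xs = [i / 10 for i in range(1, 21)]
ys = [2 * x + random.gauss(0, 0.1) for x in xs]
n, lam = len(xs), 0.1

def grad_i(w, i):
    # Gradient of the i-th regularized loss term.
    return (w * xs[i] - ys[i]) * xs[i] + lam * w

def sgd(w=0.0, eta=0.1, epochs=50):
    # Plain SGD: one noisy gradient per step.
    for _ in range(epochs):
        for _ in range(n):
            w -= eta * grad_i(w, random.randrange(n))
    return w

def svrg(w=0.0, eta=0.1, epochs=50):
    # SVRG: correct each stochastic gradient using a snapshot's full gradient.
    for _ in range(epochs):
        w_snap = w
        full = sum(grad_i(w_snap, i) for i in range(n)) / n
        for _ in range(n):
            i = random.randrange(n)
            w -= eta * (grad_i(w, i) - grad_i(w_snap, i) + full)
    return w

# Closed-form minimizer of this 1-D ridge-style objective, for comparison.
w_star = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + n * lam)
print(sgd(), svrg(), w_star)
```

Running the sketch shows SVRG's final iterate lands much closer to the exact minimizer `w_star` than SGD's under the same constant step size, matching the variance-reduction intuition behind the SVRG bounds the paper analyzes.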
Published

2017-02-13

How to Cite

Meng, Q., Wang, Y., Chen, W., Wang, T., Ma, Z.-M., & Liu, T.-Y. (2017). Generalization Error Bounds for Optimization Algorithms via Stability. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.10919