Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem

Authors

  • Bingcong Li University of Minnesota
  • Lingda Wang University of Illinois at Urbana-Champaign
  • Georgios B. Giannakis University of Minnesota
  • Zhizhen Zhao University of Illinois at Urbana-Champaign

DOI:

https://doi.org/10.1609/aaai.v35i9.17012

Keywords:

Optimization

Abstract

Targeting convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinguishing feature of ExtraFW is that it leverages a pair of gradients per iteration, so that the decision variable is updated in a prediction-correction (PC) format. With step sizes that rely on no problem-dependent parameters, ExtraFW is shown to converge at a rate of ${\cal O}\big(\frac{1}{k}\big)$ for general convex problems, which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. The main merit of ExtraFW, however, is its faster rate ${\cal O}\big(\frac{1}{k^2}\big)$ on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and a more fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that the empirical performance of ExtraFW is significantly better than that of FW, and that ExtraFW is even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW achieves a smaller optimality gap and a lower rank than FW.
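
For orientation, the following is a minimal sketch of the classical FW baseline that the abstract compares against, not of ExtraFW itself (the abstract does not spell out the ExtraFW updates). It assumes an $\ell_1$-ball constraint as one example of a sparsity-promoting set, a toy least-squares objective, and the standard parameter-free step size $\frac{2}{k+2}$. The function names `lmo_l1_ball` and `frank_wolfe`, the radius, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def lmo_l1_ball(grad, radius):
    """FW subproblem on the l1 ball: argmin over ||v||_1 <= radius of <grad, v>.

    The minimizer is a signed, scaled coordinate vector, which is why
    FW iterates on this set tend to be sparse.
    """
    i = int(np.argmax(np.abs(grad)))
    v = np.zeros_like(grad)
    v[i] = -radius * np.sign(grad[i])
    return v


def frank_wolfe(grad_f, x0, radius, num_iters):
    """Classical FW with the parameter-free step size 2 / (k + 2)."""
    x = x0.copy()
    for k in range(num_iters):
        v = lmo_l1_ball(grad_f(x), radius)  # one FW subproblem per iteration
        delta = 2.0 / (k + 2)               # needs no smoothness constant
        x = (1.0 - delta) * x + delta * v   # convex combination stays feasible
    return x


# Toy least-squares problem (synthetic data, assumed for illustration only).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [0.5, -0.3, 0.2]  # l1 norm is 1.0, so x_true is feasible
b = A @ x_true


def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2)


def grad_f(x):
    return A.T @ (A @ x - b)


x_hat = frank_wolfe(grad_f, np.zeros(20), radius=1.0, num_iters=500)
```

Each iteration here solves one FW subproblem and uses one gradient. According to the abstract, ExtraFW instead uses a pair of gradients per iteration in a prediction-correction format.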

Published

2021-05-18

How to Cite

Li, B., Wang, L., Giannakis, G. B., & Zhao, Z. (2021). Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 8324-8331. https://doi.org/10.1609/aaai.v35i9.17012

Section

AAAI Technical Track on Machine Learning II