Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem

Bingcong Li; Lingda Wang; Georgios B. Giannakis; Zhizhen Zhao

doi:10.1609/aaai.v35i9.17012

Authors

Bingcong Li, University of Minnesota
Lingda Wang, University of Illinois at Urbana-Champaign
Georgios B. Giannakis, University of Minnesota
Zhizhen Zhao, University of Illinois at Urbana-Champaign

DOI:

https://doi.org/10.1609/aaai.v35i9.17012

Keywords:

Optimization

Abstract

Aiming at convex optimization under structural constraints, this work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW. The distinctive feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format. With step sizes that rely on no problem-dependent parameters, the convergence rate of ExtraFW for general convex problems is shown to be ${\cal O}\big(\frac{1}{k}\big)$, which is optimal in the sense of matching the lower bound on the number of solved FW subproblems. The main merit of ExtraFW, however, is its faster rate ${\cal O}\big(\frac{1}{k^2}\big)$ on a class of machine learning problems. Compared with other parameter-free FW variants that have faster rates on the same problems, ExtraFW has improved rates and a fine-grained analysis thanks to its PC update. Numerical tests on binary classification with different sparsity-promoting constraints demonstrate that ExtraFW performs significantly better than FW empirically, and is even faster than Nesterov's accelerated gradient on certain datasets. For matrix completion, ExtraFW attains a smaller optimality gap and a lower rank than FW.
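For context, the baseline that ExtraFW builds on can be sketched as follows. This is a minimal illustration of the standard FW iteration with the well-known parameter-free step size $\frac{2}{k+2}$, not the ExtraFW update itself, whose exact prediction-correction steps are not spelled out in the abstract. The $\ell_1$-ball constraint, the toy least-squares objective, and the names `lmo_l1_ball` and `frank_wolfe` are assumptions made for illustration only.

```python
import numpy as np

def lmo_l1_ball(grad, radius):
    """FW subproblem over the l1 ball: argmin of <grad, v> s.t. ||v||_1 <= radius.

    The minimizer is a signed, scaled coordinate vector (a vertex of the ball).
    """
    i = int(np.argmax(np.abs(grad)))
    v = np.zeros_like(grad)
    v[i] = -radius * np.sign(grad[i])
    return v

def frank_wolfe(grad_f, x0, radius, num_iters):
    """Standard FW with step size 2/(k+2), which needs no problem-dependent constants."""
    x = x0.copy()
    for k in range(num_iters):
        v = lmo_l1_ball(grad_f(x), radius)   # one FW subproblem per iteration
        delta = 2.0 / (k + 2)                # parameter-free step size
        x = (1 - delta) * x + delta * v      # convex combination stays feasible
    return x

# Hypothetical toy problem: least squares with a sparse ground truth.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [1.0, -0.5, 0.25]
b = A @ x_true

def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2)

def grad_f(x):
    return A.T @ (A @ x - b)

x_hat = frank_wolfe(grad_f, np.zeros(20), radius=2.0, num_iters=500)
```

Each iteration solves a single linear subproblem; per the abstract, ExtraFW instead leverages a pair of gradients per iteration, at the cost of an extra subproblem.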
