Coordinate Descent Methods for DC Minimization: Optimality Conditions and Global Convergence
DOI:
https://doi.org/10.1609/aaai.v37i9.26307
Keywords:
ML: Optimization, CSO: Mixed Discrete/Continuous Optimization
Abstract
Difference-of-Convex (DC) minimization, referring to the problem of minimizing the difference of two convex functions, has found rich applications in statistical learning and has been studied extensively for decades. However, existing methods are primarily based on multi-stage convex relaxation, which only leads to the weak optimality of critical points. This paper proposes a coordinate descent method for minimizing a class of DC functions based on sequential nonconvex approximation. Our approach iteratively solves a nonconvex one-dimensional subproblem globally, and it is guaranteed to converge to a coordinate-wise stationary point. We prove that this new optimality condition is always stronger than the standard critical point condition and the directional point condition under a mild locally bounded nonconvexity assumption. For comparison, we also include a naive variant of coordinate descent methods based on sequential convex approximation in our study. When the objective function satisfies a globally bounded nonconvexity assumption and the Luo-Tseng error bound assumption, coordinate descent methods achieve a Q-linear convergence rate. Also, for many applications of interest, we show that the nonconvex one-dimensional subproblem can be computed exactly and efficiently using a breakpoint searching method. Finally, we conduct extensive experiments on several statistical learning tasks to show the superiority of our approach.
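To make the described procedure concrete, the following is a minimal illustrative sketch (not the paper's implementation) of cyclic coordinate descent for a DC objective, where each coordinate update globally minimizes the one-dimensional subproblem. The paper solves this subproblem exactly with a breakpoint searching method; here it is approximated by a dense grid search, and the specific objective, search radius, and grid size are illustrative assumptions.

```python
# Illustrative sketch only: cyclic coordinate descent for a DC objective
# f(x) = g(x) - h(x), where each coordinate update solves the one-dimensional
# subproblem min_t f(x + t * e_i) globally. The exact breakpoint search used in
# the paper is replaced here by a dense grid search over a bounded interval.
import numpy as np

def coordinate_descent_dc(f, x0, radius=5.0, grid_size=2001, n_epochs=50, tol=1e-8):
    """Cyclic coordinate descent with (approximate) global 1-D minimization."""
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    ts = np.linspace(-radius, radius, grid_size)  # candidate steps along coordinate i
    for _ in range(n_epochs):
        f_old = f(x)
        for i in range(n):
            # Evaluate f(x + t * e_i) on the grid and keep the best step.
            vals = np.empty(grid_size)
            xi = x[i]
            for k, t in enumerate(ts):
                x[i] = xi + t
                vals[k] = f(x)
            x[i] = xi + ts[np.argmin(vals)]
        if f_old - f(x) < tol:  # stop when an epoch yields no sufficient decrease
            break
    return x

# Example DC objective (assumed for illustration): least squares plus the
# nonconvex "L1 minus L-infinity" penalty, a standard difference of two convex terms.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
lam = 0.5
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + lam * (np.sum(np.abs(x)) - np.max(np.abs(x)))
print(coordinate_descent_dc(f, np.zeros(5)))
```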
Published
2023-06-26
How to Cite
Yuan, G. (2023). Coordinate Descent Methods for DC Minimization: Optimality Conditions and Global Convergence. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 11034-11042. https://doi.org/10.1609/aaai.v37i9.26307
Issue
Vol. 37 No. 9
Section
AAAI Technical Track on Machine Learning IV