Coreset Stochastic Variance-Reduced Gradient with Application to Optimal Margin Distribution Machine
A major problem for kernel-based predictors is the prohibitive computational complexity, which limits their application in large-scale datasets. Coreset, an approximation method which tries to cover the given examples with a small set of points, can be used to remain the prominent information and accelerate the kernel method. In this paper, we provide perhaps the first coreset-based kernel-accelerating optimization method that has a linear convergence rate, which is much faster than existing approaches. Our method can be used to train kernel SVM-style problems and obtain sparse solutions efficiently. Specifically, the method uses SVRG as the framework, and utilizes the core points to approximate the gradients, so it can significantly reduce the complexity of the kernel method. Furthermore, we apply the method to train ODM, a kernel machine enjoying better statistical property than SVM, so that we can reduce the risk of compromising the performance while encouraging the sparsity. We conduct extensive experiments on several large-scale datasets and the results verify that our method outperforms the state-of-the-art coreset approximation method in both efficiency and generalization, while simultaneously achieving significant speed-up compared to non-approximation baselines.