On Data-Dependent Random Features for Improved Generalization in Supervised Learning

Shahin Shahrampour; Ahmad Beirami; Vahid Tarokh

doi:10.1609/aaai.v32i1.11697

Authors

Shahin Shahrampour, Harvard University
Ahmad Beirami, Harvard University
Vahid Tarokh, Harvard University

Keywords:

Kernel methods, random features

Abstract

The randomized-feature approach has been successfully employed in large-scale kernel approximation and supervised learning. The distribution from which the random features are drawn affects the number of features required to perform a learning task efficiently. Recently, it has been shown that data-dependent randomization reduces the number of random features required. In this paper, we study the randomized-feature approach in supervised learning with the goal of good generalization. We propose the Energy-based Exploration of Random Features (EERF) algorithm, based on a data-dependent score function that explores the set of possible features and exploits the promising regions. We prove that, with high probability, the proposed score function recovers the spectrum of the best fit within the model class. Our empirical results on several benchmark datasets further verify that our method requires a smaller number of random features than the state of the art to achieve a given generalization error, while introducing negligible pre-processing overhead. EERF can be implemented in a few lines of code and requires no additional tuning parameters.
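
To make the explore-then-exploit idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the score function, so the sketch assumes a score equal to the absolute empirical correlation between the labels and a random cosine feature, and it assumes that "exploiting promising regions" means keeping the highest-scoring candidates. All names (`draw_features`, `feature`, `score`, `select_features`) and the toy data are hypothetical.

```python
import math
import random


def draw_features(num, dim, rng):
    """Explore: draw (w, b) pairs for cosine features cos(w.x + b)."""
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, 2 * math.pi))
        for _ in range(num)
    ]


def feature(x, w, b):
    """Evaluate one random cosine feature at the input x."""
    return math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)


def score(X, y, w, b):
    """Assumed score: |mean of y_i * feature(x_i)| over the training set."""
    return abs(sum(yi * feature(xi, w, b) for xi, yi in zip(X, y)) / len(y))


def select_features(X, y, num_candidates, num_keep, seed=0):
    """Exploit: keep the num_keep candidates with the largest score."""
    rng = random.Random(seed)
    candidates = draw_features(num_candidates, len(X[0]), rng)
    ranked = sorted(candidates, key=lambda wb: score(X, y, *wb), reverse=True)
    return ranked[:num_keep]


# Toy data (hypothetical): labels depend only on the first coordinate.
data_rng = random.Random(1)
X = [[data_rng.uniform(-3, 3), data_rng.uniform(-3, 3)] for _ in range(200)]
y = [1.0 if math.cos(x[0]) > 0 else -1.0 for x in X]

selected = select_features(X, y, num_candidates=50, num_keep=5)
kept_scores = [score(X, y, w, b) for w, b in selected]
```

The selected features would then feed a standard linear model such as ridge regression. Scoring the candidates costs one pass over the data per candidate, and the only quantities chosen here are the candidate count and the number of features kept.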
