Optimal Column Subset Selection by A-Star Search


  • Hiromasa Arai The University of Texas at Dallas
  • Crystal Maung The University of Texas at Dallas
  • Haim Schweitzer University of Texas at Dallas




A*, Unsupervised feature selection


Approximating a matrix by a small subset of its columns is a known problem in numerical linear algebra. Algorithms that address this problem have been used in areas which include, among others, sparse approximation, unsupervised feature selection, data mining, and knowledge representation. Such algorithms were investigated since the 1960's, with recent results that use randomization. The problem is believed to be NP-Hard, and to the best of our knowledge there are no previously published algorithms aimed at computing optimal solutions. We show how to model the problem as a graph search, and propose a heuristic based on eigenvalues of related matrices. Applying the A* search strategy with this heuristic is guaranteed to find the optimal solution. Experimental results on common datasets show that the proposed algorithm can effectively select columns from moderate size matrices, typically improving by orders of magnitude the run time of exhaustive search. We also show how to combine the proposed algorithm with other non-optimal (but much faster) algorithms in a ``two stage'' framework, which is guaranteed to improve the accuracy of the other algorithms.




How to Cite

Arai, H., Maung, C., & Schweitzer, H. (2015). Optimal Column Subset Selection by A-Star Search. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9353



AAAI Technical Track: Heuristic Search and Optimization