Consensus Guided Unsupervised Feature Selection

Authors

  • Hongfu Liu, Northeastern University
  • Ming Shao, Northeastern University
  • Yun Fu, Northeastern University

DOI:

https://doi.org/10.1609/aaai.v30i1.10221

Keywords:

Feature Selection, Consensus Clustering

Abstract

Feature selection is widely recognized as one of the key problems in the data mining and machine learning communities, especially for high-dimensional data with redundant information, partial noise, and outliers. Recently, unsupervised feature selection has attracted substantial research attention, since data acquisition is now cheap while labeling remains expensive and time-consuming. It is particularly useful for selecting effective features for clustering tasks. Recent works that use sparse projection with pre-learned pseudo labels achieve appealing results. However, they generate the pseudo labels from all features, so noisy and ineffective features degrade the cluster structure and in turn harm feature selection. These methods also suffer from a complex composition of multiple constraints and from computationally expensive steps such as eigen-decomposition. In contrast, this work introduces consensus clustering for pseudo labeling, which avoids expensive eigen-decomposition and provides better clustering accuracy with high robustness. In addition, complex constraints such as non-negativity are no longer needed, because consensus clustering yields crisp indicators. Specifically, we propose an efficient formulation for unsupervised feature selection based on the utility function and provide theoretical analysis of the optimization rules and model convergence. Extensive experiments on several popular data sets demonstrate that our methods are superior to the most recent state-of-the-art works in terms of normalized mutual information (NMI).
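
The general idea of pseudo labeling by consensus clustering followed by projection-based feature ranking can be illustrated with a minimal sketch. This is not the formulation proposed in the paper: it assumes basic partitions come from k-means on random feature subsets, it forms the consensus by running k-means on the concatenated one-hot indicators, and it substitutes a plain ridge regression for the sparse projection. The function names (kmeans, consensus_labels, rank_features), the synthetic data, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means with farthest-first initialisation; returns labels."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def one_hot(labels, k):
    """Crisp 0/1 indicator matrix of shape (n_samples, k)."""
    return np.eye(k)[labels]

def consensus_labels(X, k, n_partitions, subset_size, rng):
    """Assumed scheme: basic partitions on random feature subsets, then
    k-means on the concatenated indicators (no eigen-decomposition)."""
    blocks = []
    for _ in range(n_partitions):
        cols = rng.choice(X.shape[1], size=subset_size, replace=False)
        blocks.append(one_hot(kmeans(X[:, cols], k, rng), k))
    return kmeans(np.hstack(blocks), k, rng)

def rank_features(X, labels, k, ridge=1.0):
    """Ridge projection onto the pseudo labels (a stand-in for a sparse
    projection); features are scored by the row norms of the projection."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    H = one_hot(labels, k)
    H = H - H.mean(axis=0)
    W = np.linalg.solve(Z.T @ Z + ridge * np.eye(Z.shape[1]), Z.T @ H)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores), scores

# Synthetic data: features 0 and 1 carry three clusters, features 2..7 are noise.
rng = np.random.default_rng(0)
cluster_centers = np.array([[0, 0], [10, 0], [0, 10]], dtype=float)
true_labels = np.repeat(np.arange(3), 40)
informative = cluster_centers[true_labels] + 0.5 * rng.standard_normal((120, 2))
X = np.hstack([informative, rng.standard_normal((120, 6))])

pseudo = consensus_labels(X, k=3, n_partitions=30, subset_size=6, rng=rng)
order, scores = rank_features(X, pseudo, k=3)
print("features ranked by score:", order)
```

In this sketch the pseudo labels are crisp indicators by construction, so no non-negativity constraint is involved, and the only linear algebra is one small linear solve. A row-sparse penalty on the projection would be closer in spirit to the sparse-projection methods mentioned above than the ridge term assumed here.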

Published

2016-02-21

How to Cite

Liu, H., Shao, M., & Fu, Y. (2016). Consensus Guided Unsupervised Feature Selection. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10221

Section

Technical Papers: Machine Learning Methods