Margin-Based Feature Selection in Incomplete Data

Qiang Lou; Zoran Obradovic

doi:10.1609/aaai.v26i1.8299

Authors

Qiang Lou Temple University
Zoran Obradovic Temple University

DOI:

https://doi.org/10.1609/aaai.v26i1.8299

Keywords:

feature selection, incomplete data

Abstract

This study considers the problem of feature selection in incomplete data. The intuitive approach is to first impute the missing values and then apply a standard feature selection method to select relevant features. In this study, we show how to perform feature selection directly, without imputing missing values. We define the objective function of the uncertainty margin-based feature selection method to maximize each instance’s uncertainty margin in its own relevant subspace. During optimization, we take into account the uncertainty of each instance due to its missing values. Experimental results on synthetic data and six benchmark data sets with few missing values (less than 25%) provide evidence that our method selects features as accurately as the alternative methods that apply an imputation method first. When a large fraction of the values is missing (more than 25%), however, our feature selection method outperforms the alternatives that impute missing values first.
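To make the idea of scoring features without imputation concrete, here is a minimal sketch. It is not the uncertainty margin defined in the paper: it assumes a simpler Relief-style hypothesis margin (distance to the nearest instance of another class minus distance to the nearest instance of the same class), and it assumes distances are computed only over features observed in both instances. The function names, the NaN encoding of missing values, and the toy data are all hypothetical.

```python
import numpy as np

def masked_distance(x, z, w):
    """Weighted L1 distance over features observed in both x and z.

    Assumption: missing values are encoded as NaN, and the sum is divided
    by the number of co-observed features so that pairs with different
    amounts of missing data stay comparable.
    """
    both = ~np.isnan(x) & ~np.isnan(z)
    if not both.any():
        return float("inf")  # no shared observed feature: not comparable
    return float(np.sum(w[both] * np.abs(x[both] - z[both])) / both.sum())

def hypothesis_margin(X, y, i, w):
    """Nearest-miss distance minus nearest-hit distance for instance i.

    Assumption: every class contains at least two instances.
    """
    hits, misses = [], []
    for j in range(len(X)):
        if j == i:
            continue
        d = masked_distance(X[i], X[j], w)
        (hits if y[j] == y[i] else misses).append(d)
    return min(misses) - min(hits)

def average_margin(X, y, w):
    """Mean margin over all instances whose margin is finite."""
    margins = np.array([hypothesis_margin(X, y, i, w) for i in range(len(X))])
    return float(np.mean(margins[np.isfinite(margins)]))

# Toy data: feature 0 separates the classes, feature 1 does not.
X = np.array([
    [0.0, 0.3],
    [0.2, np.nan],  # missing value, left unimputed
    [1.0, 0.2],
    [0.8, 0.4],
])
y = np.array([0, 0, 1, 1])

margin_feature0 = average_margin(X, y, np.array([1.0, 0.0]))
margin_feature1 = average_margin(X, y, np.array([0.0, 1.0]))
```

On this toy data, putting all the weight on feature 0 gives a larger average margin than putting it on feature 1, which is the kind of signal a margin-based selector would use to prefer feature 0. A full method would optimize the weight vector rather than compare two fixed choices.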
