DiFA: Differentiable Feature Acquisition

Authors

  • Aritra Ghosh University of Massachusetts Amherst
  • Andrew Lan University of Massachusetts Amherst

DOI:

https://doi.org/10.1609/aaai.v37i6.25934

Keywords:

ML: Dimensionality Reduction/Feature Selection, ML: Active Learning, ML: Classification and Regression, ML: Deep Neural Network Algorithms, ML: Other Foundations of Machine Learning

Abstract

Feature acquisition in predictive modeling is an important task in many practical applications. For example, in patient health prediction, we do not fully observe a patient's personal features and need to dynamically select features to acquire. Our goal is to acquire a small subset of features that maximizes prediction performance. Recently, some works reformulated feature acquisition as a Markov decision process and applied reinforcement learning (RL) algorithms, where the reward reflects both prediction performance and feature acquisition cost. However, RL algorithms use only zeroth-order information on the reward, which leads to slow empirical convergence, especially when there are many actions (number of features) to consider. For predictive modeling, it is possible to use first-order information on the reward, i.e., gradients, since we are often given an already collected dataset. Therefore, we propose differentiable feature acquisition (DiFA), which uses a differentiable representation of the feature selection policy to enable gradients to flow from the prediction loss to the policy parameters. We conduct extensive experiments on various real-world datasets and show that DiFA significantly outperforms existing feature acquisition methods when the number of features is large.
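The key idea of the abstract, replacing a discrete feature-selection action with a differentiable relaxation so that prediction-loss gradients reach the policy parameters, can be illustrated with a minimal sketch. The paper's exact parameterization is not given here; the example below assumes a Gumbel-softmax relaxation (one common choice for differentiable discrete selection), and all function and variable names are illustrative.

```python
import numpy as np

def gumbel_softmax_mask(logits, temperature, rng):
    """Soft relaxation of a discrete feature-selection action.

    Adds Gumbel noise to the policy logits and applies a temperature-scaled
    softmax, so the selection "mask" is a smooth function of the logits and
    gradients from the prediction loss could flow back into them.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    scores = (logits + gumbel) / temperature
    scores = scores - scores.max()  # subtract max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()

rng = np.random.default_rng(0)
logits = np.array([0.1, 2.0, -1.0, 0.5])  # policy scores over 4 candidate features
x = np.array([3.0, -2.0, 1.5, 0.7])       # the (so far unacquired) feature values

# Soft "acquisition": a convex combination of feature values instead of a
# hard pick, which keeps the whole pipeline differentiable.
mask = gumbel_softmax_mask(logits, temperature=0.5, rng=rng)
acquired = mask @ x

# At low temperature the mask concentrates on one feature, approximating the
# discrete acquisition action while remaining differentiable in the logits.
hard = gumbel_softmax_mask(logits, temperature=0.01, rng=rng)
```

In an RL formulation, by contrast, the same selection would be a sampled discrete action, and only the scalar reward (zeroth-order information) would inform the policy update.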

Published

2023-06-26

How to Cite

Ghosh, A., & Lan, A. (2023). DiFA: Differentiable Feature Acquisition. Proceedings of the AAAI Conference on Artificial Intelligence, 37(6), 7705-7713. https://doi.org/10.1609/aaai.v37i6.25934

Section

AAAI Technical Track on Machine Learning I