DiFA: Differentiable Feature Acquisition

Authors

  • Aritra Ghosh University of Massachusetts Amherst
  • Andrew Lan University of Massachusetts Amherst

DOI:

https://doi.org/10.1609/aaai.v37i6.25934

Keywords:

ML: Dimensionality Reduction/Feature Selection, ML: Active Learning, ML: Classification and Regression, ML: Deep Neural Network Algorithms, ML: Other Foundations of Machine Learning

Abstract

Feature acquisition in predictive modeling is an important task in many practical applications. For example, in patient health prediction, we do not fully observe a patient's personal features and need to dynamically select features to acquire. Our goal is to acquire a small subset of features that maximizes prediction performance. Recently, some works reformulated feature acquisition as a Markov decision process and applied reinforcement learning (RL) algorithms, where the reward reflects both prediction performance and feature acquisition cost. However, RL algorithms use only zeroth-order information on the reward, which leads to slow empirical convergence, especially when there are many actions (number of features) to consider. For predictive modeling, it is possible to use first-order information on the reward, i.e., gradients, since we are often given an already collected dataset. Therefore, we propose differentiable feature acquisition (DiFA), which uses a differentiable representation of the feature selection policy to enable gradients to flow from the prediction loss to the policy parameters. We conduct extensive experiments on various real-world datasets and show that DiFA significantly outperforms existing feature acquisition methods when the number of features is large.
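The key idea of the abstract, replacing a discrete feature-selection action with a differentiable relaxation so that prediction-loss gradients reach the policy parameters, can be illustrated with a minimal sketch. The paper's exact parameterization is not given here; the example below assumes a Gumbel-softmax relaxation (one common choice for differentiable discrete selection), and all function and variable names are illustrative.

```python
import numpy as np

def gumbel_softmax_mask(logits, temperature, rng):
    """Soft relaxation of a discrete feature-selection action.

    Adds Gumbel noise to the policy logits and applies a temperature-scaled
    softmax, so the selection "mask" is a smooth function of the logits and
    gradients from the prediction loss could flow back into them.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    scores = (logits + gumbel) / temperature
    scores = scores - scores.max()  # subtract max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()

rng = np.random.default_rng(0)
logits = np.array([0.1, 2.0, -1.0, 0.5])  # policy scores over 4 candidate features
x = np.array([3.0, -2.0, 1.5, 0.7])       # the (so far unacquired) feature values

# Soft "acquisition": a convex combination of feature values instead of a
# hard pick, which keeps the whole pipeline differentiable.
mask = gumbel_softmax_mask(logits, temperature=0.5, rng=rng)
acquired = mask @ x

# At low temperature the mask concentrates on one feature, approximating the
# discrete acquisition action while remaining differentiable in the logits.
hard = gumbel_softmax_mask(logits, temperature=0.01, rng=rng)
```

In an RL formulation, by contrast, the same selection would be a sampled discrete action, and only the scalar reward (zeroth-order information) would inform the policy update.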

Published

2023-06-26

How to Cite

Ghosh, A., & Lan, A. (2023). DiFA: Differentiable Feature Acquisition. Proceedings of the AAAI Conference on Artificial Intelligence, 37(6), 7705-7713. https://doi.org/10.1609/aaai.v37i6.25934

Section

AAAI Technical Track on Machine Learning I