Practical Markov Boundary Learning without Strong Assumptions

Authors

  • Xingyu Wu, University of Science and Technology of China
  • Bingbing Jiang, Hangzhou Normal University
  • Tianhao Wu, University of Science and Technology of China
  • Huanhuan Chen, School of Computer Science and Technology, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v37i9.26236

Keywords:

ML: Dimensionality Reduction/Feature Selection, ML: Causal Learning, RU: Causality, RU: Bayesian Networks

Abstract

Theoretically, the Markov boundary (MB) is the optimal solution for feature selection. However, existing MB learning algorithms often fail to identify some critical features in real-world feature selection tasks, mainly because their strict assumptions on data distribution, variable types, or correctness of criteria cannot be satisfied in application scenarios. This paper takes further steps toward opening the door to real-world applications for MB. In particular, we contribute a practical MB learning strategy that remains feasible and effective on real-world data, where variables can be numerical or categorical with linear or nonlinear, pairwise or multivariate relationships. Specifically, we investigate the equivalence between the MB and the minimal conditional covariance operator (CCO), which inspires an objective function based on the predictability evaluation of the mapping variables in a reproducing kernel Hilbert space. Based on this, a kernel MB learning algorithm is proposed, in which nonlinear multivariate dependence can be considered without extra requirements on data distribution or variable types. Extensive experiments demonstrate the efficacy of these contributions.
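
The abstract does not give the objective function itself, so the following is only an illustrative sketch of the general idea of scoring a feature subset by a conditional covariance operator in a reproducing kernel Hilbert space. It assumes a standard regularized empirical estimate of the trace, eps * Tr[G_Y (G_X + n * eps * I)^{-1}] with centered Gram matrices, together with an RBF kernel, a median-heuristic bandwidth, and a simple greedy forward search. The function names (`centered_rbf_gram`, `cco_trace`, `greedy_select`) and the parameters `eps` and `tol` are hypothetical choices for this example and are not taken from the paper. The sketch handles numerical variables only and is not the authors' algorithm.

```python
import numpy as np


def centered_rbf_gram(Z, gamma=None):
    """Centered RBF Gram matrix of the rows of Z (assumed kernel choice)."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    if gamma is None:
        # Median heuristic for the bandwidth (an assumption of this sketch).
        pos = d2[d2 > 0]
        gamma = 1.0 / np.median(pos) if pos.size else 1.0
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    return H @ K @ H


def cco_trace(X_sub, y, eps=1e-3):
    """Regularized empirical trace of the conditional covariance of y given X_sub.

    Lower values mean y is more predictable from X_sub in the RKHS.
    """
    X_sub = np.asarray(X_sub, dtype=float)
    if X_sub.ndim == 1:
        X_sub = X_sub[:, None]
    n = len(y)
    Gy = centered_rbf_gram(y)
    if X_sub.shape[1] == 0:
        # Empty conditioning set: G_X = 0, so the estimate reduces to Tr[G_Y] / n.
        return np.trace(Gy) / n
    Gx = centered_rbf_gram(X_sub)
    return eps * np.trace(np.linalg.solve(Gx + n * eps * np.eye(n), Gy))


def greedy_select(X, y, k, eps=1e-3, tol=1e-2):
    """Greedy forward search: add the feature that lowers the score most.

    Stops after k features or when the relative improvement drops below tol.
    """
    X = np.asarray(X, dtype=float)
    selected = []
    current = cco_trace(X[:, :0], y, eps)
    while len(selected) < k:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        scores = {j: cco_trace(X[:, selected + [j]], y, eps) for j in candidates}
        best = min(scores, key=scores.get)
        if current - scores[best] < tol * current:
            break
        selected.append(best)
        current = scores[best]
    return selected
```

In this sketch, a subset containing the variables that actually drive the target should receive a lower score than a subset of unrelated variables, and no subset can score above the empty-set baseline Tr[G_Y] / n. A greedy search of this kind does not guarantee a minimal set, so it should not be read as a Markov boundary learner in its own right.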

Published

2023-06-26

How to Cite

Wu, X., Jiang, B., Wu, T., & Chen, H. (2023). Practical Markov Boundary Learning without Strong Assumptions. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 10388–10398. https://doi.org/10.1609/aaai.v37i9.26236

Section

AAAI Technical Track on Machine Learning IV