Nearest-Neighbor Sampling Based Conditional Independence Testing

Authors

  • Shuai Li (School of Statistics, KLATASDS-MOE, East China Normal University, Shanghai, China)
  • Ziqi Chen (School of Statistics, KLATASDS-MOE, East China Normal University, Shanghai, China)
  • Hongtu Zhu (Departments of Biostatistics, Statistics, Computer Science, and Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, USA)
  • Christina Dan Wang (Business Division, New York University Shanghai, Shanghai, China)
  • Wang Wen (School of Mathematics and Statistics, Central South University, Changsha, China)

DOI:

https://doi.org/10.1609/aaai.v37i7.26039

Keywords:

ML: Causal Learning, ML: Classification and Regression, ML: Deep Generative Models & Autoencoders

Abstract

The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z. The CRT assumes that the conditional distribution of X given Z is known under the null hypothesis and compares it with the distribution of the observed samples of the original data. The aim of this paper is to develop a novel alternative to the CRT that uses nearest-neighbor sampling and does not assume the exact form of the distribution of X given Z. Specifically, we use the computationally efficient 1-nearest-neighbor method to approximate the conditional distribution that encodes the null hypothesis. We then show theoretically that the distribution of the generated samples is very close to the true conditional distribution in terms of total variation distance. Furthermore, we take the classifier-based conditional mutual information estimator as our test statistic. As an empirical version of a fundamental information-theoretic quantity, this test statistic captures conditional dependence well. We show that our proposed test is computationally very fast while controlling type I and type II errors quite well. Finally, we demonstrate the efficiency of our proposed test in both synthetic and real data analyses.
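
As a rough illustration of the 1-nearest-neighbor sampling idea mentioned in the abstract, the Python sketch below replaces each X value with the X value paired with the nearest Z in a held-out pool, which approximates a draw from the distribution of X given Z. The abstract does not spell out the procedure, so the sample split, the Euclidean distance, the function name `one_nn_sample`, and the synthetic data are all assumptions made for illustration. The classifier-based conditional mutual information statistic is not included.

```python
import numpy as np


def one_nn_sample(x_pool, z_pool, z_query):
    """Hypothetical helper: for each row of z_query, return the x value
    whose paired z in the pool is nearest in Euclidean distance."""
    # Pairwise squared distances, shape (n_query, n_pool).
    diff = z_query[:, None, :] - z_pool[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the 1-nearest neighbor in the pool
    return x_pool[nearest], nearest


# Tiny deterministic check: queries at 1 and 9 match pool points 0 and 10.
x_small, idx_small = one_nn_sample(
    np.array([[1.0], [2.0]]),   # x values in the pool
    np.array([[0.0], [10.0]]),  # z values in the pool
    np.array([[1.0], [9.0]]),   # z values to sample for
)

# Synthetic data (assumed): X depends on Z only, so X and Y are
# conditionally independent given Z.
rng = np.random.default_rng(0)
n = 400
z = rng.normal(size=(n, 2))
x = z[:, :1] + 0.1 * rng.normal(size=(n, 1))
y = z[:, 1:] + 0.1 * rng.normal(size=(n, 1))

# Assumed split: the second half serves as the neighbor pool,
# the first half is the sample that receives generated X values.
half = n // 2
x_tilde, idx = one_nn_sample(x[half:], z[half:], z[:half])

# (x_tilde, y[:half], z[:half]) is an approximate sample under the null.
null_sample = np.hstack([x_tilde, y[:half], z[:half]])
```

In a full test, a statistic computed on the observed triples would be compared with the same statistic computed on generated triples such as `null_sample`. How the paper carries out that comparison is not stated in the abstract.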

Published

2023-06-26

How to Cite

Li, S., Chen, Z., Zhu, H., Wang, C. D., & Wen, W. (2023). Nearest-Neighbor Sampling Based Conditional Independence Testing. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 8631-8639. https://doi.org/10.1609/aaai.v37i7.26039

Issue

Section

AAAI Technical Track on Machine Learning II