Discovering Linear Non-Gaussian Models for All Categories of Missing Data (Student Abstract)

Authors

  • Matteo Ceriscioli (Oregon State University; RIKEN)
  • Shohei Shimizu (RIKEN; The University of Osaka; Shiga University)
  • Karthika Mohan (Oregon State University)

DOI:

https://doi.org/10.1609/aaai.v40i48.42195

Abstract

Causal discovery is the task of learning causal models, which encode causal relationships, from a source of information such as an observational dataset. Although many algorithms have been developed to discover causal models under varied sets of assumptions, the case in which the dataset contains missing values remains significantly underexplored. Naively applying standard causal discovery algorithms to listwise, test-wise, or regression-wise deleted datasets, or to imputed data, can introduce spurious associations between variables and bias function estimation in functional causal models. This issue arises when the data is missing at random (MAR) or missing not at random (MNAR). It invalidates the theoretical guarantees of these algorithms and prevents them from finding the true underlying causal model, even in the large-sample limit. An established family of causal models is the Linear Non-Gaussian Acyclic Model (LiNGAM), which assumes linear functional relationships and independent non-Gaussian noise terms. We propose a new causal discovery algorithm for LiNGAM that can recover the underlying causal structure and provide unbiased estimates of the model’s parameters, even when the data is affected by MNAR missingness.
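
The bias that the abstract attributes to listwise deletion can be illustrated with a small simulation. The sketch below is not the proposed algorithm. It only generates data from a two-variable linear non-Gaussian model, hides values of the effect variable with a missing-not-at-random mechanism, and compares the regression slope before and after listwise deletion. The function names, the Laplace noise, the coefficient 1.5, and the logistic missingness mechanism are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def simulate_lingam(n, b=1.5, seed=0):
    """Draw n samples from a hypothetical two-variable LiNGAM: x -> y.

    Both noise terms are Laplace distributed (non-Gaussian) and independent.
    """
    rng = np.random.default_rng(seed)
    x = rng.laplace(size=n)            # x = e_x
    y = b * x + rng.laplace(size=n)    # y = b * x + e_y
    return x, y


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def mnar_observed_mask(y, seed=1):
    """Assumed self-masking mechanism: larger values of y are more likely
    to be missing, so missingness depends on the unobserved value itself."""
    rng = np.random.default_rng(seed)
    p_observed = 1.0 / (1.0 + np.exp(y))   # sigmoid(-y)
    return rng.random(y.shape[0]) < p_observed


x, y = simulate_lingam(200_000)
observed = mnar_observed_mask(y)

# Slope on the complete data versus slope after dropping rows with missing y.
slope_full = ols_slope(x, y)
slope_listwise = ols_slope(x[observed], y[observed])

print(f"true coefficient:        1.5")
print(f"slope on complete data:  {slope_full:.3f}")
print(f"slope after listwise deletion: {slope_listwise:.3f}")
```

In this setup the complete-data slope stays close to the true coefficient, while the slope computed after listwise deletion is noticeably smaller, and collecting more samples does not close the gap. This is the kind of large-sample bias in function estimation that motivates a missingness-aware method.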

Published

2026-03-14

How to Cite

Ceriscioli, M., Shimizu, S., & Mohan, K. (2026). Discovering Linear Non-Gaussian Models for All Categories of Missing Data (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 40(48), 41151–41153. https://doi.org/10.1609/aaai.v40i48.42195