Impute Missing Entries with Uncertainty

Authors

  • Jaesung Lim, Department of Statistics and Data Science, University of Seoul, S. Korea
  • Seunghwan An, Department of Information and Telecommunication Engineering, Incheon National University, S. Korea
  • Jong-June Jeon, Department of Statistics, University of Seoul, S. Korea

DOI:

https://doi.org/10.1609/aaai.v40i28.39523

Abstract

Missing data presents a widespread challenge in real-world data collection. In this paper, our goal is to impute missing entries while accurately reflecting the uncertainty associated with them. We introduce U-VAE, a method that employs a non-parametric distributional learning strategy to parameterize the likelihood of missing values. To address the infeasibility of directly estimating the underlying conditional distributions due to data incompleteness, we incorporate stochastic re-masking and un-masking techniques during training. Specifically, we replace the conventional reconstruction loss with the continuous ranked probability score (CRPS), a strictly proper scoring rule, and theoretically demonstrate that the discrepancy between the underlying conditional distribution and our imputer is upper-bounded. We evaluate the performance of U-VAE on 11 real-world datasets, showing its effectiveness in both single and multiple imputations, while also enhancing post-imputation performance and supporting valid statistical inference.
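To make the scoring rule named in the abstract concrete, the following is a minimal sketch of a sample-based CRPS estimate, using the standard identity CRPS(F, y) = E|X - y| - (1/2) E|X - X'| with X and X' drawn independently from F. It is not the authors' implementation: the function name `crps_from_samples` and its arguments are hypothetical, and the sketch only illustrates how a set of imputation draws for one missing entry could be scored against an observed value.

```python
def crps_from_samples(samples, y):
    """Estimate CRPS(F, y) from draws of F (hypothetical helper, illustration only).

    Uses the identity CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|,
    where both expectations are replaced by averages over the given samples.
    Lower values indicate a better probabilistic forecast.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Average absolute error between each draw and the observed value.
    accuracy_term = sum(abs(x - y) for x in samples) / n
    # Half the average absolute difference over all ordered pairs of draws.
    spread_term = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return accuracy_term - spread_term


if __name__ == "__main__":
    draws = [0.0, 2.0]  # assumed imputation draws for a single entry
    print(crps_from_samples(draws, 1.0))
```

With a single draw, the estimate reduces to the absolute error, which is why CRPS is often described as a generalization of mean absolute error to distributional predictions.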

Published

2026-03-14

How to Cite

Lim, J., An, S., & Jeon, J.-J. (2026). Impute Missing Entries with Uncertainty. Proceedings of the AAAI Conference on Artificial Intelligence, 40(28), 23514–23522. https://doi.org/10.1609/aaai.v40i28.39523

Section

AAAI Technical Track on Machine Learning V