Counterfactual Fairness with Disentangled Causal Effect Variational Autoencoder

Authors

  • Hyemi Kim, Korea Advanced Institute of Science and Technology
  • Seungjae Shin, Korea Advanced Institute of Science and Technology
  • JoonHo Jang, Korea Advanced Institute of Science and Technology
  • Kyungwoo Song, University of Seoul
  • Weonyoung Joo, Korea Advanced Institute of Science and Technology
  • Wanmo Kang, Korea Advanced Institute of Science and Technology
  • Il-Chul Moon, Korea Advanced Institute of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v35i9.16990

Keywords:

Ethics -- Bias, Fairness, Transparency & Privacy

Abstract

The problem of fair classification can be mitigated if we develop a method to remove the embedded sensitive information from the classification features. This line of work separates the sensitive information through causal inference, and causal inference enables counterfactual generation to contrast the what-if case of the opposite sensitive attribute. Along with this causal separation, a frequent assumption in deep latent causal models is a single latent variable that absorbs the entire exogenous uncertainty of the causal graph. However, we claim that such a structure cannot distinguish 1) information caused by the intervention (i.e., the sensitive variable) from 2) information merely correlated with the intervention in the data. Therefore, this paper proposes the Disentangled Causal Effect Variational Autoencoder (DCEVAE) to resolve this limitation by disentangling the exogenous uncertainty into two latent variables: one 1) independent of the intervention and the other 2) correlated with the intervention but without a causal link. In particular, our disentangling approach preserves the latent variable correlated with the intervention when generating counterfactual examples. We show that our method estimates the total effect and the counterfactual effect without a complete causal graph. By adding a fairness regularization, DCEVAE generates a counterfactually fair dataset while losing less of the original information. DCEVAE also generates natural counterfactual images by flipping only the sensitive information. Additionally, we theoretically characterize the differences in the covariance structures of DCEVAE and prior works from the perspective of latent disentanglement.
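
To make the two-latent-variable idea concrete, below is a minimal, illustrative PyTorch sketch of a VAE that splits the exogenous uncertainty into one latent block independent of the sensitive attribute and one block correlated with it, and that generates a counterfactual by flipping only the sensitive attribute. The class and variable names (SketchDCEVAE, z_ind, z_cor), the network sizes, and the exact loss terms are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch: two disentangled latent blocks plus a binary sensitive attribute a in {0, 1}.
import torch
import torch.nn as nn

class SketchDCEVAE(nn.Module):
    def __init__(self, x_dim, a_dim=1, z_ind_dim=8, z_cor_dim=8, h_dim=64):
        super().__init__()
        # Encoder for z_ind: latent factors assumed independent of the sensitive attribute a.
        self.enc_ind = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU(),
                                     nn.Linear(h_dim, 2 * z_ind_dim))
        # Encoder for z_cor: latent factors correlated with a (so a is given as input).
        self.enc_cor = nn.Sequential(nn.Linear(x_dim + a_dim, h_dim), nn.ReLU(),
                                     nn.Linear(h_dim, 2 * z_cor_dim))
        # Decoder reconstructs x from both latent blocks and the (possibly flipped) attribute a.
        self.dec = nn.Sequential(nn.Linear(z_ind_dim + z_cor_dim + a_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    @staticmethod
    def reparameterize(stats):
        # Split encoder output into mean / log-variance and draw a reparameterized sample.
        mu, logvar = stats.chunk(2, dim=-1)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std), mu, logvar

    def forward(self, x, a):
        z_ind, mu_i, lv_i = self.reparameterize(self.enc_ind(x))
        z_cor, mu_c, lv_c = self.reparameterize(self.enc_cor(torch.cat([x, a], dim=-1)))
        x_hat = self.dec(torch.cat([z_ind, z_cor, a], dim=-1))
        # Standard Gaussian KL terms for both latent blocks.
        kl = -0.5 * torch.sum(1 + lv_i - mu_i.pow(2) - lv_i.exp()) \
             - 0.5 * torch.sum(1 + lv_c - mu_c.pow(2) - lv_c.exp())
        return x_hat, kl

    def counterfactual(self, x, a):
        # Counterfactual generation: keep both latent blocks, flip only the binary sensitive attribute.
        z_ind, _, _ = self.reparameterize(self.enc_ind(x))
        z_cor, _, _ = self.reparameterize(self.enc_cor(torch.cat([x, a], dim=-1)))
        return self.dec(torch.cat([z_ind, z_cor, 1.0 - a], dim=-1))
```

In this sketch, training would combine a reconstruction loss with the KL terms above; the fairness regularization and the disentanglement objective described in the paper would be added on top and are omitted here.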

Published

2021-05-18

How to Cite

Kim, H., Shin, S., Jang, J., Song, K., Joo, W., Kang, W., & Moon, I.-C. (2021). Counterfactual Fairness with Disentangled Causal Effect Variational Autoencoder. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 8128-8136. https://doi.org/10.1609/aaai.v35i9.16990

Section

AAAI Technical Track on Machine Learning II