Imbalance-Aware Uplift Modeling for Observational Data

Xuanying Chen; Zhining Liu; Li Yu; Liuyi Yao; Wenpeng Zhang; Yi Dong; Lihong Gu; Xiaodong Zeng; Yize Tan; Jinjie Gu

doi:10.1609/aaai.v36i6.20581

Authors

Xuanying Chen Ant Group
Zhining Liu Ant Group
Li Yu Ant Group
Liuyi Yao Alibaba Group
Wenpeng Zhang Ant Group
Yi Dong Ant Group
Lihong Gu Ant Group
Xiaodong Zeng Ant Group
Yize Tan Ant Group
Jinjie Gu Ant Group

DOI:

https://doi.org/10.1609/aaai.v36i6.20581

Keywords:

Machine Learning (ML), Data Mining & Knowledge Management (DMKM)

Abstract

Uplift modeling aims to model the incremental impact of a treatment on an individual outcome, and it has attracted great interest from researchers and practitioners in different communities. Existing uplift modeling methods rely on either data collected from randomized controlled trials (RCTs) or observational data, which is more realistic. However, we notice that in observational data it is often the case that only a small number of subjects receive the treatment, while the uplift must ultimately be inferred on a much larger group of subjects. Such highly imbalanced data is common in various fields such as marketing and medical treatment, but it is rarely handled by existing work. In this paper, we theoretically and quantitatively prove that the existing representative methods, transformed outcome (TOM) and doubly robust (DR), suffer from large bias and deviation on highly imbalanced datasets with skewed propensity scores, mainly because they are proportional to the reciprocal of the propensity score. To reduce the bias and deviation of uplift modeling on an imbalanced dataset, we propose an imbalance-aware uplift modeling (IAUM) method that constructs a robust proxy outcome, which adaptively combines the doubly robust estimator and the imputed treatment effects based on the propensity score. We theoretically prove that IAUM can obtain a better bias-variance trade-off than existing methods on a highly imbalanced dataset. We conduct extensive experiments on a synthetic dataset and two real-world datasets, and the experimental results demonstrate the superiority of our method over state-of-the-art methods.
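To make the reciprocal-propensity issue concrete, the following is a minimal sketch of the commonly used textbook forms of the transformed outcome and the doubly robust proxy outcome. It is not taken from the paper and does not implement IAUM, whose exact weighting scheme is not given in the abstract. The function names and the symbols `e` (propensity score), `mu0` and `mu1` (outcome-model predictions under control and treatment) are assumptions introduced for illustration.

```python
def transformed_outcome(y, t, e):
    """Textbook transformed outcome: Y*T/e - Y*(1-T)/(1-e).

    y: observed outcome, t: treatment indicator (0 or 1),
    e: propensity score P(T=1 | X), assumed to lie strictly in (0, 1).
    """
    return y * t / e - y * (1 - t) / (1 - e)


def doubly_robust_outcome(y, t, e, mu0, mu1):
    """Textbook doubly robust proxy outcome.

    mu0, mu1: hypothetical outcome-model predictions for the same
    subject under control and under treatment.
    """
    return (mu1 - mu0) + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)


# The same treated subject (y = 1) under a balanced and a skewed propensity.
balanced = transformed_outcome(1.0, 1, 0.5)
skewed = transformed_outcome(1.0, 1, 0.01)

# In the DR form only the outcome-model residual (y - mu1) is scaled by 1/e,
# so the proxy is stable when the outcome model is accurate, but a small
# propensity still amplifies any residual error.
dr_exact_model = doubly_robust_outcome(1.0, 1, 0.01, mu0=0.2, mu1=1.0)
dr_poor_model = doubly_robust_outcome(1.0, 1, 0.01, mu0=0.2, mu1=0.5)

print(balanced, skewed, dr_exact_model, dr_poor_model)
```

With a propensity of 0.01 the transformed outcome for a treated subject is fifty times larger in magnitude than with a propensity of 0.5, which illustrates why proxy outcomes of this kind become unstable when few subjects are treated.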
