Purifier: Defending Data Inference Attacks via Transforming Confidence Scores

Authors

  • Ziqi Yang, Zhejiang University; ZJU-Hangzhou Global Scientific and Technological Innovation Center; Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province
  • Lijin Wang, Zhejiang University
  • Da Yang, Zhejiang University
  • Jie Wan, Zhejiang University
  • Ziming Zhao, Zhejiang University
  • Ee-Chien Chang, National University of Singapore
  • Fan Zhang, Zhejiang University; Jiaxing Research Institute, Zhejiang University; Zhengzhou Xinda Institute of Advanced Technology
  • Kui Ren, Zhejiang University; ZJU-Hangzhou Global Scientific and Technological Innovation Center; Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province; Jiaxing Research Institute, Zhejiang University

DOI:

https://doi.org/10.1609/aaai.v37i9.26289

Keywords:

ML: Privacy-Aware ML, CV: Bias, Fairness & Privacy, PEAI: Privacy and Security

Abstract

Neural networks are susceptible to data inference attacks such as the membership inference attack, the adversarial model inversion attack, and the attribute inference attack, in which the attacker infers useful information such as the membership, a reconstruction, or sensitive attributes of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a method, namely PURIFIER, to defend against membership inference attacks. PURIFIER transforms the confidence score vectors predicted by the target classifier so that the purified confidence scores are indistinguishable between members and non-members in individual shape, statistical distribution, and prediction label. Experimental results show that PURIFIER defends against membership inference attacks with high effectiveness and efficiency, outperforming previous defense methods while incurring negligible utility loss. Further experiments show that PURIFIER is also effective against adversarial model inversion attacks and attribute inference attacks: for example, with PURIFIER deployed, the inversion error on the Facescrub530 classifier increases by roughly a factor of four, and the attribute inference accuracy drops significantly.
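The abstract's pipeline can be sketched as follows. This is a minimal illustrative stand-in, not the paper's trained purifier: `Purifier` here is a hypothetical untrained encoder/decoder pair that passes a confidence vector through a low-dimensional bottleneck (coarsening its individual shape) and then restores the original top-1 label, so the attacker only ever sees the transformed scores while classification accuracy is preserved.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class Purifier:
    """Hypothetical confidence-score purifier (illustrative only; the
    paper trains an autoencoder-style purifier on confidence vectors).
    Maps a confidence vector through a random low-dimensional bottleneck,
    then swaps entries so the input's predicted label stays the argmax."""

    def __init__(self, n_classes, bottleneck=3, seed=0):
        rng = np.random.default_rng(seed)
        self.enc = rng.normal(scale=0.1, size=(n_classes, bottleneck))
        self.dec = rng.normal(scale=0.1, size=(bottleneck, n_classes))

    def __call__(self, conf):
        conf = np.atleast_2d(conf)
        # Coarse reconstruction through the bottleneck, renormalized.
        purified = softmax(conf @ self.enc @ self.dec)
        out = purified.copy()
        for i in range(len(conf)):
            a = np.argmax(conf[i])      # original predicted label
            b = np.argmax(purified[i])  # label after purification
            # Swap so the purified vector keeps the original top-1 label.
            out[i, a], out[i, b] = purified[i, b], purified[i, a]
        return out

# Demo: purify the confidence scores of two samples from a 5-class classifier.
conf = softmax(np.array([[2.0, 0.5, 0.1, -1.0, 0.0],
                         [0.1, 3.0, 0.2, 0.0, -0.5]]))
purified = Purifier(n_classes=5)(conf)
```

The attacker-facing output (`purified`) still sums to 1 per sample and keeps each sample's predicted label, but its fine-grained shape no longer reflects the raw classifier confidences that membership and inversion attacks exploit.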

Published

2023-06-26

How to Cite

Yang, Z., Wang, L., Yang, D., Wan, J., Zhao, Z., Chang, E.-C., Zhang, F., & Ren, K. (2023). Purifier: Defending Data Inference Attacks via Transforming Confidence Scores. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 10871-10879. https://doi.org/10.1609/aaai.v37i9.26289

Section

AAAI Technical Track on Machine Learning IV