PrivacyAlert: A Dataset for Image Privacy Prediction

Authors

  • Chenye Zhao, Department of Computer Science, University of Illinois at Chicago
  • Jasmine Mangat, College of Information and Computer Sciences, University of Massachusetts Amherst
  • Sujay Koujalgi, College of Information Science and Technology, Pennsylvania State University
  • Anna Squicciarini, College of Information Science and Technology, Pennsylvania State University
  • Cornelia Caragea, Department of Computer Science, University of Illinois at Chicago

Keywords:

Web and Social Media

Abstract

Image privacy has become an important challenge as millions of images are shared on social networking sites every day. Often because of a lack of privacy awareness and social pressure, users post images that reveal sensitive information and may easily be used to their detriment. To address these issues, several recent studies have proposed machine learning models that automatically identify whether an image contains private information. However, progress on this important task has been hampered by the absence of reliable, publicly available, up-to-date datasets. To this end, we introduce PrivacyAlert, a dataset developed from recent images extracted from Flickr and annotated with privacy labels (private or public). Our data collection process is based on a state-of-the-art privacy taxonomy and captures a comprehensive set of image types of varying sensitivity. We perform a comprehensive analysis of our dataset and report image privacy prediction results using classic and deep learning models to lay the groundwork for future studies. Our dataset is publicly available at: https://doi.org/10.5281/zenodo.6406870.

Published

2022-05-31

How to Cite

Zhao, C., Mangat, J., Koujalgi, S., Squicciarini, A., & Caragea, C. (2022). PrivacyAlert: A Dataset for Image Privacy Prediction. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1352-1361. Retrieved from https://ojs.aaai.org/index.php/ICWSM/article/view/19387