A Labeled Dataset for Investigating Cyberbullying Content Patterns in Instagram


  • Mara Hamlett Arizona State University
  • Grace Powell Arizona State University
  • Yasin N. Silva Loyola University Chicago
  • Deborah Hall Arizona State University




  • Organizational and group behavior mediated by social media; interpersonal communication mediated by social media
  • Psychological, personality-based and ethnographic studies of social media
  • Qualitative and quantitative studies of social media
  • Text categorization; topic recognition; demographic/gender/age identification


As online communication continues to become more prevalent, instances of cyberbullying have also become more common, particularly on social media sites. Previous research in this area has studied cyberbullying outcomes, predictors of cyberbullying victimization/perpetration, and computational detection models that rely on labeled datasets to identify the underlying patterns. However, there is a dearth of work examining the content of what is said when cyberbullying occurs, and most of the available datasets include only basic labels (cyberbullying or not). This paper presents an annotated Instagram dataset with detailed labels about key cyberbullying properties, such as the content type, purpose, directionality, and co-occurrence with other phenomena, as well as demographic information about the individuals who performed the annotations. Additionally, results of an exploratory logistic regression analysis are reported to illustrate how new insights about cyberbullying and its automatic detection can be gained from this labeled dataset.




How to Cite

Hamlett, M., Powell, G., Silva, Y. N., & Hall, D. (2022). A Labeled Dataset for Investigating Cyberbullying Content Patterns in Instagram. Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1251-1258. https://doi.org/10.1609/icwsm.v16i1.19376