Interpretations, Representations, and Stereotypes of Caste within Text-to-Image Generators
DOI:
https://doi.org/10.1609/aies.v7i1.31652
Abstract
The surge in the popularity of text-to-image generators (T2Is) has been matched by extensive research into ensuring fairness and equitable outcomes, with a focus on how they impact society. However, such work has typically focused on globally-experienced identities or centered Western contexts. In this paper, we address interpretations, representations, and stereotypes surrounding a tragically underexplored context in T2I research: caste. We examine how the T2I Stable Diffusion displays people of various castes, and what professions they are depicted as performing. Generating 100 images per prompt, we perform CLIP-cosine similarity comparisons with default depictions of an ‘Indian person’ by Stable Diffusion, and explore patterns of similarity. Our findings reveal how Stable Diffusion outputs perpetuate systems of ‘castelessness’, equating Indianness with high-castes and depicting caste-oppressed identities with markers of poverty. In particular, we note the stereotyping and representational harm towards the historically-marginalized Dalits, prominently depicted as living in rural areas and always at protests. Our findings underscore a need for a caste-aware approach towards T2I design, and we conclude with design recommendations.
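As a rough illustration of the comparison step named in the abstract, the sketch below averages pairwise cosine similarity between two sets of embedding vectors. It is only a minimal sketch under stated assumptions: the abstract does not say how similarities were aggregated, so the mean over all pairs, the function names `cosine_similarity` and `mean_pairwise_similarity`, and the toy vectors are all hypothetical. In practice the vectors would be CLIP image embeddings of the generated images, which are not computed here.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(set_a: Sequence[Vector], set_b: Sequence[Vector]) -> float:
    """Average cosine similarity over every (a, b) pair from the two sets.

    Assumption: averaging over all pairs is one plausible aggregation;
    the paper may aggregate differently.
    """
    total = sum(cosine_similarity(u, v) for u in set_a for v in set_b)
    return total / (len(set_a) * len(set_b))


# Toy stand-ins for embeddings (hypothetical values, not real CLIP output).
baseline_embeddings = [[1.0, 0.0], [0.0, 1.0]]   # e.g. images for a default prompt
prompt_embeddings = [[1.0, 0.0]]                 # e.g. images for a caste-specific prompt

score = mean_pairwise_similarity(prompt_embeddings, baseline_embeddings)
print(f"mean pairwise cosine similarity: {score:.2f}")
```

A higher score would indicate that images for a given prompt sit closer, in embedding space, to the default depictions; the value alone says nothing about why.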
Published
2024-10-16
How to Cite
Ghosh, S. (2024). Interpretations, Representations, and Stereotypes of Caste within Text-to-Image Generators. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 490-502. https://doi.org/10.1609/aies.v7i1.31652
Section
Full Archival Papers