As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making
DOI: https://doi.org/10.1609/aies.v7i1.31665

Abstract
We investigate the phenomenon of norm inconsistency, in which LLMs apply different norms to similar situations. Specifically, we focus on the high-risk application of deciding whether to call the police in response to Amazon Ring home surveillance videos. We evaluate the decisions of three state-of-the-art LLMs — GPT-4, Gemini 1.0, and Claude 3 Sonnet — in relation to the activities portrayed in the videos, the subjects' skin tone and gender, and the characteristics of the neighborhoods where the videos were recorded. Our analysis reveals significant norm inconsistencies: (1) a discordance between the recommendation to call the police and the actual presence of criminal activity, and (2) biases influenced by the racial demographics of the neighborhoods. These results highlight the arbitrariness of model decisions in the surveillance context and the limitations of current bias detection and mitigation strategies in normative decision-making.
Published
2024-10-16
How to Cite
Jain, S., Calacci, D., & Wilson, A. (2024). As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 624-633. https://doi.org/10.1609/aies.v7i1.31665
Section
Full Archival Papers