As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making

Authors

  • Shomik Jain Massachusetts Institute of Technology
  • D. Calacci Penn State University
  • Ashia Wilson Massachusetts Institute of Technology

DOI:

https://doi.org/10.1609/aies.v7i1.31665

Abstract

We investigate the phenomenon of norm inconsistency, in which LLMs apply different norms in similar situations. Specifically, we focus on the high-risk application of deciding whether to call the police in response to Amazon Ring home surveillance videos. We evaluate the decisions of three state-of-the-art LLMs (GPT-4, Gemini 1.0, and Claude 3 Sonnet) in relation to the activities portrayed in the videos, the subjects' skin tone and gender, and the characteristics of the neighborhoods where the videos were recorded. Our analysis reveals significant norm inconsistencies: (1) a discordance between the recommendation to call the police and the actual presence of criminal activity, and (2) biases influenced by the racial demographics of the neighborhoods. These results highlight the arbitrariness of model decisions in the surveillance context and the limitations of current bias detection and mitigation strategies in normative decision-making.

Published

2024-10-16

How to Cite

Jain, S., Calacci, D., & Wilson, A. (2024). As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 624-633. https://doi.org/10.1609/aies.v7i1.31665