LLM Safety in Judicial AI: A Stress Test of Social Media Influence on Real-World Judgments

Authors

  • Yixuan Xie, The Hong Kong University of Science and Technology
  • Yang He, University of Macau
  • Xiaoyu Yang, The Hong Kong University of Science and Technology (Guangzhou)
  • Xu Gai, University of Bonn
  • Pan Hui, The Hong Kong University of Science and Technology (Guangzhou)

DOI

https://doi.org/10.1609/aaai.v40i46.41297

Abstract

Integrating Large Language Models (LLMs) into judicial decision-making demands rigorous safety examination against non-legal influences. This paper presents a novel stress test in which we introduce social media sentiment as an external pressure on LLM-generated labor dispute outcomes and compare those outcomes against 10,000 real-world court judgments from China Judgments Online (CJOL). Our findings reveal significant LLM safety vulnerabilities: models exhibit inherent deviations from real rulings, and public opinion substantially amplifies these discrepancies, leading to unstable and often inflated compensation predictions. Furthermore, these safety risks are compounded for low-skilled occupational categories and emotionally charged topics. This study uncovers critical threats to judicial integrity and public trust, underscoring the urgent need for robust safeguards against non-legal influences in AI legal systems.

Published

2026-03-14

How to Cite

Xie, Y., He, Y., Yang, X., Gai, X., & Hui, P. (2026). LLM Safety in Judicial AI: A Stress Test of Social Media Influence on Real-World Judgments. Proceedings of the AAAI Conference on Artificial Intelligence, 40(46), 39468–39476. https://doi.org/10.1609/aaai.v40i46.41297