HEV Generative Sandbox: A Framework for Assessing Domain-Specific Social Risks Through Human-LLM Simulation

Yiran Liu; Zhiyi Hou; Xiaoang Xu; Shuo Wang; Huijia Wu; Kaicheng Yu; Yang Yu; ChengXiang Zhai

doi:10.1609/aaai.v40i38.40498

Authors

Yiran Liu, Tsinghua University
Zhiyi Hou, Westlake University; Zhejiang University; Shanghai Innovation Institute
Xiaoang Xu, Beijing University of Posts and Telecommunications
Shuo Wang, Tsinghua University
Huijia Wu, Beijing University of Posts and Telecommunications
Kaicheng Yu, Westlake University
Yang Yu, China University of Petroleum (Beijing)
ChengXiang Zhai, University of Illinois Urbana-Champaign

Abstract

Deploying Large Language Models (LLMs) in specialized domains introduces significant societal and compliance risks, including bias amplification, misinformation propagation, and privacy violations. These risks emerge predominantly from the dynamic interactions between LLMs and humans in specific contexts. Each domain faces its own distribution of hazards, and different interaction modalities introduce different levels of exposure and vulnerability. However, current risk assessment frameworks lack a systematic methodology for capturing this dynamic interplay. In this work, we introduce the HEV Generative Sandbox, a novel risk evaluation framework that simulates human-LLM behavior to quantify domain-contextual risks across three interdependent dimensions: 1) Hazard (H): the domain-specific threats inherent to a given context; 2) Exposure (E): the extent to which the LLM and its users are subjected to hazardous scenarios; 3) Vulnerability (V): the susceptibility of the system to risk due to human interaction or model weaknesses. Our approach pioneers "domain-rooted scenario generation", in which we sample contextual distributions from domain-specific corpora and simulate diverse inputs. By unifying dynamic scenario simulation, causal risk decomposition, and closed-loop evaluation, the HEV Generative Sandbox provides a scalable, domain-sensitive methodology for responsible LLM deployment. The result is a comprehensive, automated risk evaluation framework that supports the safe deployment of LLMs.
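To make the three dimensions concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not state how H, E, and V are combined; the multiplicative form used here is an assumption borrowed from the conventional hazard-exposure-vulnerability risk framing. All names (`Scenario`, `scenario_risk`, `sample_scenarios`, `mean_risk`), the topic weights, and the scorer callback are hypothetical stand-ins for the corpus-derived distributions and simulated interactions the abstract describes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One simulated scenario with hypothetical scores in [0, 1]."""
    topic: str
    hazard: float         # H: severity of the domain-specific threat
    exposure: float       # E: how strongly the LLM and its users are subjected to it
    vulnerability: float  # V: how susceptible the system is in this scenario


def scenario_risk(s: Scenario) -> float:
    # Assumption: multiplicative composition, so risk vanishes if any factor is zero.
    return s.hazard * s.exposure * s.vulnerability


def sample_scenarios(topic_weights, scorer, n, seed=0):
    """Draw n topics in proportion to their weights and score each draw.

    topic_weights: dict mapping topic -> relative frequency (a stand-in for a
                   distribution estimated from a domain-specific corpus).
    scorer:        callable topic -> (hazard, exposure, vulnerability); a
                   stand-in for scores obtained from simulated interactions.
    """
    rng = random.Random(seed)
    topics = list(topic_weights)
    weights = [topic_weights[t] for t in topics]
    drawn = rng.choices(topics, weights=weights, k=n)
    return [Scenario(t, *scorer(t)) for t in drawn]


def mean_risk(scenarios):
    """Average per-scenario risk over the sampled scenarios."""
    return sum(scenario_risk(s) for s in scenarios) / len(scenarios)
```

A caller would supply topic weights estimated for the target domain and a scorer backed by simulated human-LLM exchanges, then compare `mean_risk` across domains or interaction modalities. Other aggregation rules (for example a weighted sum, or a maximum over scenarios) fit the same structure.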
