Bias Association Discovery Framework for Open-Ended LLM Generations

Jinhao Pan; Chahat Raj; Ziwei Zhu

doi:10.1609/aaai.v40i38.40541

Authors

Jinhao Pan, George Mason University
Chahat Raj, George Mason University
Ziwei Zhu, George Mason University

Abstract

Social biases embedded in Large Language Models (LLMs) raise critical concerns, resulting in representational harms -- unfair or distorted portrayals of demographic groups -- that may be expressed in subtle ways through generated language. Existing evaluation methods often depend on predefined identity-concept associations, limiting their ability to surface new or unexpected forms of bias. In this work, we present the Bias Association Discovery Framework (BADF), a systematic approach for extracting both known and previously unrecognized associations between demographic identities and descriptive concepts from open-ended LLM outputs. Through comprehensive experiments spanning multiple models and diverse real-world contexts, BADF enables robust mapping and analysis of the varied concepts that characterize demographic identities. Our findings advance the understanding of biases in open-ended generation and provide a scalable tool for identifying and analyzing bias associations in LLMs.
