Bias Association Discovery Framework for Open-Ended LLM Generations

Authors

  • Jinhao Pan, George Mason University
  • Chahat Raj, George Mason University
  • Ziwei Zhu, George Mason University

DOI:

https://doi.org/10.1609/aaai.v40i38.40541

Abstract

Social biases embedded in Large Language Models (LLMs) raise critical concerns because they result in representational harms, that is, unfair or distorted portrayals of demographic groups, which may be expressed in subtle ways through generated language. Existing evaluation methods often depend on predefined identity-concept associations, limiting their ability to surface new or unexpected forms of bias. In this work, we present the Bias Association Discovery Framework (BADF), a systematic approach for extracting both known and previously unrecognized associations between demographic identities and descriptive concepts from open-ended LLM outputs. Through comprehensive experiments spanning multiple models and diverse real-world contexts, BADF enables robust mapping and analysis of the varied concepts that characterize demographic identities. Our findings advance the understanding of biases in open-ended generation and provide a scalable tool for identifying and analyzing bias associations in LLMs.

Published

2026-03-14

How to Cite

Pan, J., Raj, C., & Zhu, Z. (2026). Bias Association Discovery Framework for Open-Ended LLM Generations. Proceedings of the AAAI Conference on Artificial Intelligence, 40(38), 32637–32645. https://doi.org/10.1609/aaai.v40i38.40541

Issue

Vol. 40 No. 38

Section

AAAI Technical Track on Natural Language Processing III