DarkBench+: An Extended Benchmark for Evaluating Dark Patterns in Large Language Models

Authors

  • Yaowen Liu, China People's Police University
  • Shenjia Jing, China People's Police University
  • Yufei Wei, China People's Police University
  • Shoumin Zhang, China People's Police University
  • Jinglu Zhang, East China Normal University
  • Zhen Mei, China People's Police University
  • Liangliang Yue, China People's Police University
  • Jiarui Wang, China People's Police University
  • Peng Zhang, China People's Police University

DOI:

https://doi.org/10.1609/aaai.v40i44.41103

Abstract

With the widespread deployment of large language models (LLMs) in human-computer interaction, dark patterns have extended from traditional visual interfaces to conversational AI systems. While existing research has confirmed the prevalence of dark patterns in LLMs, current evaluation benchmarks have three critical limitations: limited classification coverage, overlooked risks specific to reasoning models, and inadequate consideration of cross-linguistic differences. To address these limitations, we propose DarkBench+, an extended benchmark for evaluating dark patterns in LLMs. We construct an expanded taxonomy containing 10 major categories and 24 subcategories, introduce an annotation workflow combining manual and automated methods, and design 2,088 bilingual test samples in Chinese and English. DarkBench+ is the first benchmark to develop specialized evaluation dimensions for reasoning models, and we use it to systematically evaluate dark pattern behaviors across nearly 40 mainstream LLMs. Experimental results demonstrate significant manipulation risks in reasoning models' transparency displays, and the cross-linguistic evaluation analyzes how AI manipulation behavior differs across linguistic environments. Together, these findings aim to promote more ethical and responsible LLM development.

Published

2026-03-14

How to Cite

Liu, Y., Jing, S., Wei, Y., Zhang, S., Zhang, J., Mei, Z., … Zhang, P. (2026). DarkBench+: An Extended Benchmark for Evaluating Dark Patterns in Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37682–37691. https://doi.org/10.1609/aaai.v40i44.41103

Section

AAAI Special Track on AI Alignment