GermanPartiesQA: Benchmarking Commercial Large Language Models and AI Companions for Political Alignment and Sycophancy

Authors

  • Jan Batzner, Weizenbaum Institute; Technical University Munich
  • Volker Stocker, Weizenbaum Institute; Technical University Berlin
  • Stefan Schmid, Weizenbaum Institute; Technical University Berlin
  • Gjergji Kasneci, Technical University Munich

DOI:

https://doi.org/10.1609/aies.v8i1.36552

Abstract

Large language models (LLMs) are increasingly shaping citizens’ information ecosystems. Products incorporating LLMs, such as chatbots and AI Companions, are now widely used for decision support and information retrieval, including in sensitive domains, raising concerns about hidden biases and a growing potential to shape individual decisions and public opinion. This paper introduces GermanPartiesQA, a benchmark of 418 political statements from German Voting Advice Applications across 11 elections to evaluate six commercial LLMs. We evaluate their political alignment based on role-playing experiments with political personas. Our evaluation reveals three specific findings: (1) Factual limitations: LLMs show limited ability to accurately generate factual party positions, particularly for centrist parties. (2) Model-specific ideological alignment: We identify consistent alignment patterns and a degree of political steerability for each model across temperature settings and experiments. (3) Claim of sycophancy: While models adjust to political personas during role-play, we find this reflects persona-based steerability rather than the increasingly popular yet contested concept of sycophancy. Our study contributes to evaluating the political alignment of closed-source LLMs that are increasingly embedded in electoral decision support tools and AI Companion chatbots.

Published

2025-10-15

How to Cite

Batzner, J., Stocker, V., Schmid, S., & Kasneci, G. (2025). GermanPartiesQA: Benchmarking Commercial Large Language Models and AI Companions for Political Alignment and Sycophancy. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8(1), 330–342. https://doi.org/10.1609/aies.v8i1.36552