Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms

Authors

  • Joshua Ashkinaze, University of Michigan
  • Ruijia Guan, University of Michigan
  • Laura Kurek, University of Michigan
  • Eytan Adar, University of Michigan
  • Ceren Budak, University of Michigan
  • Eric Gilbert, University of Michigan

DOI:

https://doi.org/10.1609/icwsm.v20i1.42630

Abstract

Large language models (LLMs) are trained on broad corpora and then used in communities with specialized norms and rules. But can LLMs apply community rules well enough to follow these norms? We evaluate LLMs’ capacity to detect (Task 1) and correct (Task 2) biased Wikipedia edits according to Wikipedia’s Neutral Point of View (NPOV) policy. LLMs struggled with bias detection, achieving only 64% accuracy on a balanced dataset. Models exhibited contrasting biases (some under- and others over-predicted bias), suggesting distinct priors about neutrality. LLMs performed better at bias correction, removing 79% of words removed by Wikipedia editors. However, LLMs made additional changes beyond Wikipedia editors’ simpler neutralizations, resulting in high-recall but low-precision editing. Interestingly, crowd-workers rated AI rewrites as more neutral (70%) and fluent-sounding (61%) than Wikipedia-editor rewrites. Qualitative analysis found that LLMs sometimes applied NPOV more comprehensively than Wikipedia editors but often made extraneous non-NPOV-related changes (e.g., grammar). LLMs may apply rules in ways that resonate with the public, but diverge from community experts. While potentially effective for generation, LLMs may reduce editor agency and increase moderation workload (e.g., verifying additions). Even when rules are easy to articulate, having LLMs apply them like community members may still be difficult.
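
To make the "high-recall but low-precision" description concrete, the following is a minimal sketch of how word-removal recall and precision could be computed for a single edit. It is an illustration only: the abstract does not specify the tokenization or the exact metric definitions, so the whitespace tokenization, the multiset comparison, the function names (`removed_words`, `removal_precision_recall`), and the example sentences are all assumptions rather than the authors' method.

```python
from collections import Counter


def removed_words(original: str, rewrite: str) -> Counter:
    """Multiset of lowercase whitespace tokens present in `original` but absent from `rewrite`.

    Assumption: simple whitespace tokenization; the paper may tokenize differently.
    """
    return Counter(original.lower().split()) - Counter(rewrite.lower().split())


def removal_precision_recall(original: str, editor_rewrite: str, model_rewrite: str):
    """Compare the words a model removed against the words a human editor removed.

    recall    = share of editor-removed words that the model also removed
    precision = share of model-removed words that the editor also removed
    """
    editor_removed = removed_words(original, editor_rewrite)
    model_removed = removed_words(original, model_rewrite)

    # Multiset intersection: words removed by both the editor and the model.
    overlap = sum((editor_removed & model_removed).values())
    n_editor = sum(editor_removed.values())
    n_model = sum(model_removed.values())

    recall = overlap / n_editor if n_editor else 0.0
    precision = overlap / n_model if n_model else 0.0
    return precision, recall


# Hypothetical example sentences (not taken from the paper's dataset).
original = "the brilliant senator bravely passed the landmark bill"
editor_rewrite = "the senator passed the bill"      # removes only the loaded words
model_rewrite = "the senator passed legislation"    # removes those plus extra words

precision, recall = removal_precision_recall(original, editor_rewrite, model_rewrite)
```

In this hypothetical case the model removes every word the editor removed (recall of 1.0) but also drops two words the editor kept, so only three of its five removals match the editor's (precision of 0.6). That is the pattern the abstract describes: the model catches what editors catch while making additional changes.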

Published

2026-05-25

How to Cite

Ashkinaze, J., Guan, R., Kurek, L., Adar, E., Budak, C., & Gilbert, E. (2026). Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 146–173. https://doi.org/10.1609/icwsm.v20i1.42630