“Reasoning” with Rhetoric: On the Style-Evidence Tradeoff in LLM-Generated Counter-Arguments

Authors

  • Preetika Verma, National University of Singapore
  • Kokil Jaidka, National University of Singapore
  • Svetlana Churina, National University of Singapore

DOI:

https://doi.org/10.1609/icwsm.v19i1.35913

Abstract

Large language models (LLMs) play a key role in generating evidence-based and stylistic counter-arguments, yet their effectiveness in real-world applications has been underexplored. Previous research often neglects the balance between evidentiality and style, both of which are crucial for persuasive arguments. To address this, we evaluated the effectiveness of stylized evidence-based counter-argument generation in Counterfire, a new dataset of 38,000 counter-arguments generated by revising counter-arguments to Reddit’s ChangeMyView community to follow different discursive styles. We evaluated generic and stylized counter-arguments from basic and fine-tuned models such as GPT-3.5, PaLM-2, and Koala-13B, as well as newer models (GPT-4o, Claude Haiku, LLaMA-3.1), focusing on rhetorical quality and persuasiveness. Our findings reveal that humans prefer stylized counter-arguments over the original outputs, with GPT-3.5 Turbo performing well, though still not reaching human standards of rhetorical quality or persuasiveness. Additionally, our work created a novel dataset of argument triplets for studying style control, with human preference labels that provide insights into the tradeoffs between evidence integration and argument quality.

Published

2025-06-07

How to Cite

Verma, P., Jaidka, K., & Churina, S. (2025). “Reasoning” with Rhetoric: On the Style-Evidence Tradeoff in LLM-Generated Counter-Arguments. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 1966–1989. https://doi.org/10.1609/icwsm.v19i1.35913