Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System
DOI:
https://doi.org/10.1609/aaai.v40i37.40353
Abstract
State-of-the-art (SOTA) fact-checking systems combat misinformation by employing autonomous LLM-based agents to decompose complex claims into smaller sub-claims, verify each sub-claim individually, and aggregate the partial results into verdicts with justifications (explanations for the verdicts). The security of these systems is crucial, since a compromised fact-checker can amplify misinformation, yet it remains largely underexplored. To bridge this gap, this work introduces a novel threat model against such fact-checking systems and presents Fact2Fiction, the first poisoning attack framework targeting SOTA agentic fact-checking systems. Fact2Fiction employs LLMs to mimic the decomposition strategy and exploits system-generated justifications to craft tailored malicious evidence that compromises sub-claim verification. Extensive experiments demonstrate that Fact2Fiction achieves 8.9%-21.2% higher attack success rates than SOTA attacks across various poisoning budgets, exposing security weaknesses in existing fact-checking systems and highlighting the need for defensive countermeasures.
Downloads
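To make the decompose-verify-aggregate pipeline from the abstract concrete, here is a minimal illustrative sketch. All names (`decompose`, `verify`, `aggregate`, the toy evidence corpus) are assumptions for illustration only; the paper's actual system uses LLM-based agents for each step rather than the string heuristics below.

```python
# Illustrative sketch of an agentic fact-checking pipeline: a claim is
# decomposed into sub-claims, each sub-claim is verified against an
# evidence corpus, and the partial verdicts are aggregated. Poisoning a
# single sub-claim's evidence is enough to flip the overall verdict.
from dataclasses import dataclass


@dataclass
class SubClaimResult:
    subclaim: str
    verdict: bool       # True = supported by retrieved evidence
    justification: str  # explanation for the verdict


def decompose(claim: str) -> list[str]:
    # Stand-in for an LLM decomposer: split a conjunctive claim.
    return [part.strip() for part in claim.split(" and ")]


def verify(subclaim: str, corpus: dict[str, bool]) -> SubClaimResult:
    # Stand-in for agentic retrieval + verification over a corpus.
    verdict = corpus.get(subclaim, False)
    reason = "supported by evidence" if verdict else "no supporting evidence"
    return SubClaimResult(subclaim, verdict, reason)


def aggregate(results: list[SubClaimResult]) -> tuple[bool, list[str]]:
    # The overall claim holds only if every sub-claim is verified.
    return all(r.verdict for r in results), [r.justification for r in results]


corpus = {"the bridge opened in 1937": True, "it is 5 km long": False}
claim = "the bridge opened in 1937 and it is 5 km long"
verdict, justifications = aggregate([verify(s, corpus) for s in decompose(claim)])
print(verdict)  # False: one compromised sub-claim flips the verdict
```

The all-sub-claims aggregation rule is what makes the system attractive to target: an attacker who poisons the evidence for just one sub-claim can change the final verdict, which is the weakness Fact2Fiction exploits.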
Published
2026-03-14
How to Cite
He, H., Li, Y., Zhu, B. B., Wen, D., Cheng, R., & Lau, F. C. M. (2026). Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System. Proceedings of the AAAI Conference on Artificial Intelligence, 40(37), 30943-30950. https://doi.org/10.1609/aaai.v40i37.40353
Issue
Section
AAAI Technical Track on Natural Language Processing II