ARGH-Mark: Anchor-Synchronized Watermarking with Hamming Correction for Robust and Quality-Preserving LLM Attribution

He Li; Xiaojun Chen; Jingcheng He; Zhendong Zhao; Shuguang Yuan; Xin Zhao; Yunfei Yang

doi:10.1609/aaai.v40i44.41092

Authors

He Li: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Xiaojun Chen: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Jingcheng He: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Zhendong Zhao: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China
Shuguang Yuan: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China
Xin Zhao: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Yunfei Yang: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

Abstract

The proliferation of large language models has intensified demands for reliable content attribution, yet existing watermarking techniques face a fundamental trilemma: they cannot simultaneously optimize for robustness against attacks, minimal text quality degradation, and detection efficiency. To resolve this challenge, we propose ARGH-Mark, a novel watermarking framework that integrates three synergistic innovations: (1) Anchor-synchronized phase recovery for maintaining detection integrity under insertion/deletion attacks, (2) RG-balanced vocabulary modulation that dynamically partitions lexicons via contextual hashing to preserve generation quality, and (3) Hamming-based error correction enabling single-bit error rectification through algebraic coding. Comprehensive evaluations across question answering (ELI5), summarization (CNN/DailyMail), and text generation (C4) demonstrate state-of-the-art performance: the proposed ARGH-Mark framework achieves near-perfect match rate and bit accuracy across diverse configurations, while preserving the quality of the generated text. It significantly reduces detection latency, enabling real-time extraction, and maintains high robustness against token tampering attacks through integrated Hamming error correction, ensuring reliable attribution in adversarial settings. ARGH-Mark achieves a new Pareto frontier in the watermarking design space and advances trustworthy deployment of generative AI in alignment-critical applications.
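
The single-bit error rectification mentioned in the abstract can be illustrated with the textbook Hamming(7,4) code. The sketch below is only an illustration under stated assumptions: the abstract does not give the code parameters or bit layout used by ARGH-Mark, so the (7,4) choice, the function names `hamming74_encode` and `hamming74_decode`, and the bit ordering are all hypothetical.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword.

    Assumed layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4.
    This is the classic Hamming(7,4) layout, not necessarily the paper's.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code: List[int]) -> Tuple[List[int], int]:
    """Correct at most one flipped bit and return (data bits, syndrome).

    A syndrome of 0 means no error was detected; otherwise it is the
    1-indexed position of the bit that was flipped back.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    message = [1, 0, 1, 1]
    codeword = hamming74_encode(message)
    corrupted = list(codeword)
    corrupted[4] ^= 1  # simulate one tampered watermark bit
    recovered, position = hamming74_decode(corrupted)
    print(recovered == message, position)
```

In a watermarking setting, a flipped bit would correspond to a payload bit misread after token tampering; a code of this kind recovers the payload as long as each 7-bit block contains at most one such error.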
