ARGH-Mark: Anchor-Synchronized Watermarking with Hamming Correction for Robust and Quality-Preserving LLM Attribution

Authors

  • He Li: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
  • Xiaojun Chen: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
  • Jingcheng He: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
  • Zhendong Zhao: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China
  • Shuguang Yuan: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China
  • Xin Zhao: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
  • Yunfei Yang: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; State Key Laboratory of Cyberspace Security Defense, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

DOI:

https://doi.org/10.1609/aaai.v40i44.41092

Abstract

The proliferation of large language models has intensified demands for reliable content attribution, yet existing watermarking techniques face a fundamental trilemma: they cannot simultaneously optimize for robustness against attacks, minimal text quality degradation, and detection efficiency. To resolve this challenge, we propose ARGH-Mark, a novel watermarking framework that integrates three synergistic innovations: (1) anchor-synchronized phase recovery, which maintains detection integrity under insertion/deletion attacks; (2) RG-balanced vocabulary modulation, which dynamically partitions lexicons via contextual hashing to preserve generation quality; and (3) Hamming-based error correction, which rectifies single-bit errors through algebraic coding. Comprehensive evaluations across question answering (ELI5), summarization (CNN/DailyMail), and text generation (C4) demonstrate state-of-the-art performance: ARGH-Mark achieves a near-perfect match rate and bit accuracy across diverse configurations while preserving the quality of the generated text. It significantly reduces detection latency, enabling real-time extraction, and its integrated Hamming error correction maintains high robustness against token tampering attacks, ensuring reliable attribution in adversarial settings. ARGH-Mark achieves a new Pareto frontier in the watermarking design space and advances trustworthy deployment of generative AI in alignment-critical applications.
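
The single-bit rectification mentioned in item (3) can be illustrated with a textbook Hamming(7,4) code. This is a minimal sketch, not the authors' implementation: the abstract does not state the code parameters, so the (7,4) layout, the function names hamming74_encode and hamming74_decode, and the 4-bit payload are all assumptions made for illustration.

```python
# Illustrative Hamming(7,4) sketch. The (7,4) parameters and all names here
# are assumptions for illustration; the abstract does not specify them.

def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword.

    Codeword positions 1..7 hold [p1, p2, d1, p3, d2, d3, d4],
    with parity bits at positions 1, 2 and 4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code):
    """Correct at most one flipped bit and return (data_bits, syndrome).

    A syndrome of 0 means no error was detected; otherwise it is the
    1-based position of the bit that was flipped back.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], syndrome


# Usage: one extracted watermark bit is corrupted, e.g. by token tampering.
payload = [1, 0, 1, 1]
codeword = hamming74_encode(payload)
tampered = list(codeword)
tampered[4] ^= 1  # corrupt position 5
recovered, position = hamming74_decode(tampered)
```

In this sketch, recovered equals the original payload and position identifies the corrupted bit. A (7,4) code corrects only one error per 7-bit block; two or more flips in the same block are decoded incorrectly, which is the usual trade-off for its low redundancy.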

Published

2026-03-14

How to Cite

Li, H., Chen, X., He, J., Zhao, Z., Yuan, S., Zhao, X., & Yang, Y. (2026). ARGH-Mark: Anchor-Synchronized Watermarking with Hamming Correction for Robust and Quality-Preserving LLM Attribution. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37583–37590. https://doi.org/10.1609/aaai.v40i44.41092

Issue

Section

AAAI Special Track on AI Alignment