A Content-Preserving Secure Linguistic Steganography

Authors

  • Lingyun Xiang, School of Computer Science and Technology, Changsha University of Science and Technology
  • Chengfu Ou, College of Cyberspace Security, Jinan University
  • Xu He, School of Computer Science and Technology, Changsha University of Science and Technology
  • Zhongliang Yang, School of Cyberspace Security, Beijing University of Posts and Telecommunications
  • Yuling Liu, College of Cyber Science and Technology, Hunan University

DOI:

https://doi.org/10.1609/aaai.v40i42.40903

Abstract

Existing linguistic steganography methods primarily rely on content transformations to conceal secret messages. However, these transformations often introduce subtle, innocent-looking deviations between normal and stego texts, which pose potential security risks in real-world applications. To address this challenge, we propose a content-preserving linguistic steganography paradigm that achieves perfectly secure covert communication without modifying the cover text. Based on this paradigm, we introduce CLstega (Content-preserving Linguistic steganography), a novel method that embeds secret messages through controllable distribution transformation. CLstega first applies an augmented masking strategy to locate and mask embedding positions where the probability distributions predicted by a masked language model (MLM) are easily adjusted. A dynamic distribution steganographic coding strategy then encodes the secret messages by deriving target distributions from the original probability distributions. To realize this transformation, CLstega carefully selects target words for the embedding positions and uses them as labels to construct a masked sentence dataset. This dataset is used to fine-tune the original MLM, producing a target MLM capable of extracting the secret messages directly from the cover text. This approach ensures perfect security of the secret messages while fully preserving the integrity of the original cover text. Experimental results demonstrate that CLstega achieves a 100% extraction success rate and outperforms existing methods in security, effectively balancing embedding capacity and security.
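The embedding and extraction idea described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a simple fixed-length, rank-based code over the candidate words at one masked position, standing in for the dynamic distribution steganographic coding, and it does not perform any fine-tuning. The function names, the example distribution, and the simulated prediction of the target MLM are all hypothetical.

```python
from typing import Dict


def assign_codes(probs: Dict[str, float], bits: int) -> Dict[str, str]:
    """Map the 2**bits most probable candidates to fixed-length binary codes.

    Assumption: codes follow probability rank (ties broken alphabetically).
    The paper's dynamic coding strategy may differ.
    """
    k = 2 ** bits
    ranked = sorted(probs, key=lambda w: (-probs[w], w))[:k]
    if len(ranked) < k:
        raise ValueError("not enough candidate words for the requested bits")
    return {word: format(rank, f"0{bits}b") for rank, word in enumerate(ranked)}


def choose_target_word(probs: Dict[str, float], secret: str, bits: int) -> str:
    """Sender side: pick the candidate whose code equals the secret bits.

    In the abstract's pipeline this word would serve as the label used to
    fine-tune the original MLM into the target MLM (not shown here).
    """
    code_to_word = {code: word for word, code in assign_codes(probs, bits).items()}
    return code_to_word[secret]


def extract_bits(probs: Dict[str, float], predicted: str, bits: int) -> str:
    """Receiver side: decode the word predicted by the target MLM."""
    return assign_codes(probs, bits)[predicted]


# Hypothetical original-MLM distribution at one masked position.
original_probs = {"quick": 0.4, "lazy": 0.3, "brown": 0.2, "small": 0.1}

target_word = choose_target_word(original_probs, "10", bits=2)
# Simulate a target MLM that has been fine-tuned to predict the target word.
recovered = extract_bits(original_probs, predicted=target_word, bits=2)
print(target_word, recovered)
```

In this sketch the cover text itself never changes; only the word that the (simulated) target model predicts at the masked position carries the secret bits, which mirrors the content-preserving property claimed in the abstract.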

Published

2026-03-14

How to Cite

Xiang, L., Ou, C., He, X., Yang, Z., & Liu, Y. (2026). A Content-Preserving Secure Linguistic Steganography. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35885-35893. https://doi.org/10.1609/aaai.v40i42.40903

Section

AAAI Technical Track on Philosophy and Ethics of AI