MSAT-LDM: Toward Transferable High-Fidelity Watermarking for Latent Diffusion Model via Modular Self-Augmented Training
DOI:
https://doi.org/10.1609/aaai.v40i42.40921
Abstract
The rapid proliferation of AI-generated images necessitates effective watermarking techniques to protect intellectual property and detect fraudulent content. While existing training-based watermarking methods show promise, they often struggle to generalize across diverse prompts, introduce visible artifacts, and require substantial external data for retraining on new model variants. To address these limitations, we propose Modular Self-Augmented Training for Latent Diffusion Models (MSAT-LDM), a novel and transferable watermarking framework. MSAT-LDM integrates two key components. (1) Self-Augmented Training (SAT) leverages an internally generated "free generation" distribution to train the watermark module, aligning the training and testing phases without relying on external data. We theoretically demonstrate that this design improves generalization by inducing a tighter generalization bound. (2) The modular watermark architecture is a plug-and-play module that can be fine-tuned independently, enabling efficient adaptation to various fine-tuned backbones or LoRA-enhanced variants with minimal overhead. Extensive experiments show that MSAT-LDM achieves robust watermarking, significantly improves the quality of watermarked images across diverse prompts, and exhibits strong transfer performance, all without the need for external training data.
Published
2026-03-14
How to Cite
Zhang, L., & Zeng, L. (2026). MSAT-LDM: Toward Transferable High-Fidelity Watermarking for Latent Diffusion Model via Modular Self-Augmented Training. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 36048–36056. https://doi.org/10.1609/aaai.v40i42.40921
Section
AAAI Technical Track on Philosophy and Ethics of AI