DRFGD: Disentangled Representation-Focused Generative Defense for Attack-Tolerant Cross-Modal Hashing

Zhongqing Yu; Xin Liu; Yiu-ming Cheung; Zhikai Hu; Wentao Fan; Pan Zhou

doi:10.1609/aaai.v40i19.38659

Authors

Zhongqing Yu: Department of Computer Science, Huaqiao University; Fujian Key Lab. of Big Data Intell. and Security & Xiamen CVPR Key Laboratory
Xin Liu: Department of Computer Science, Huaqiao University; Department of Computer Science, Hong Kong Baptist University
Yiu-ming Cheung: Department of Computer Science, Hong Kong Baptist University
Zhikai Hu: Department of Computer Science, Hong Kong Baptist University
Wentao Fan: Department of Artificial Intelligence, Beijing Normal-Hong Kong Baptist University
Pan Zhou: School of Cyber Science and Engineering, Huazhong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i19.38659

Abstract

With the widespread deployment of cross-modal retrieval in real-world scenarios, ensuring robustness against adversarial attacks is increasingly critical. Deep cross-modal hashing is highly vulnerable to such attacks because of its discrete nature and low-dimensional hash codes. Existing defense methods often fail to suppress perturbations embedded in vulnerable features and lack the capacity to model modality-specific structural differences, which results in suboptimal adversarial robustness. To address these challenges, we propose a novel Disentangled Representation-Focused Generative Defense (DRFGD) framework for attack-tolerant cross-modal hashing. Without altering the structure of the retrieval model, DRFGD defends against adversarial attacks by using an efficient dual-branch semantic-aware encoder to disentangle input representations into adversarial-robust and adversarial-vulnerable components. Guided by the disentangled robust features, an attack-tolerant generative module synthesizes semantically aligned and perturbation-resilient examples for robust adversarial training, thereby strengthening the collaborative defense against attackers. As a result, semantically consistent hash codes can be obtained, which enhances adversarial robustness in complex cross-modal attack scenarios. Extensive experiments on public benchmarks demonstrate that DRFGD substantially improves retrieval robustness under various attack scenarios and achieves improved defense performance compared with state-of-the-art (SOTA) methods.
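
The vulnerability that the abstract attributes to the discrete nature of hash codes can be illustrated with a toy example. The sketch below is not the DRFGD method; it assumes sign-based binarization (a common choice in hashing, not confirmed by this abstract) and uses hypothetical feature values and a hypothetical perturbation. It only shows that a small change to features lying near zero can flip bits of the resulting code.

```python
# Toy illustration only: hypothetical values, sign-based binarization assumed.

def hash_code(features):
    """Binarize real-valued features into a {-1, +1} code via the sign."""
    return [1 if x >= 0 else -1 for x in features]

def hamming(a, b):
    """Count the positions where two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

# Hypothetical 4-dimensional feature vector; the last two entries sit near zero.
clean = [0.90, -0.80, 0.02, -0.03]

# Hypothetical small perturbation, bounded by 0.05 in every dimension.
perturbation = [-0.05, 0.05, -0.05, 0.05]
attacked = [x + d for x, d in zip(clean, perturbation)]

clean_code = hash_code(clean)
attacked_code = hash_code(attacked)
flipped = hamming(clean_code, attacked_code)

print(clean_code, attacked_code, flipped)
```

In this toy case the two near-zero dimensions change sign while the two large-magnitude dimensions do not, so half of a 4-bit code flips under a perturbation of at most 0.05 per dimension. With short codes, each flipped bit is a large share of the Hamming distance used for retrieval.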
