Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker

Authors

  • Xinyu Xiang, Wuhan University
  • Qinglong Yan, Wuhan University
  • Hao Zhang, Wuhan University
  • Jianfeng Ding, Wuhan University
  • Han Xu, Southeast University
  • Zhongyuan Wang, Wuhan University
  • Jiayi Ma, Wuhan University

DOI:

https://doi.org/10.1609/aaai.v39i8.32931

Abstract

Current research on adversarial attacks mainly focuses on RGB trackers, with no existing methods for attacking RGB-T cross-modal trackers. To fill this gap and overcome its challenges, we propose a progressive adversarial patch generation framework and achieve cross-modal stealth. On the one hand, we design a coarse-to-fine architecture grounded in the latent space to progressively and precisely uncover the vulnerabilities of RGB-T trackers. On the other hand, we introduce a correlation-breaking loss that disrupts the modal coupling within trackers, spanning from the pixel to the semantic level. These two design elements ensure that the proposed method can overcome the obstacles posed by cross-modal information complementarity in implementing attacks. Furthermore, to enhance the reliable application of the adversarial patches in the real world, we develop a point tracking-based reprojection strategy that effectively mitigates performance degradation caused by multi-angle distortion during imaging. Extensive experiments demonstrate the superiority of our method.

Published

2025-04-11

How to Cite

Xiang, X., Yan, Q., Zhang, H., Ding, J., Xu, H., Wang, Z., & Ma, J. (2025). Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8620-8627. https://doi.org/10.1609/aaai.v39i8.32931

Issue

Vol. 39 No. 8 (2025)

Section

AAAI Technical Track on Computer Vision VII