Infrared-Privileged UAV Detection via Cross-Modal Vector-Quantization

Authors

  • Zhibo Lou, Jiangxi University of Finance and Economics
  • Ruijie Zhang, Jiangxi University of Finance and Economics
  • Zeyu Luo, Jiangxi University of Finance and Economics
  • Qianxi Cao, Jiangxi University of Finance and Economics
  • Feng Qian, Jiangxi University of Finance and Economics
  • Junjie Chen, Jiangxi University of Finance and Economics
  • Yuming Fang, Jiangxi University of Finance and Economics

DOI:

https://doi.org/10.1609/aaai.v40i9.37692

Abstract

Combining RGB and infrared (IR) images has shown remarkable robustness for object detection from unmanned aerial vehicles (UAVs). However, raw RGB and IR images are inevitably misaligned because of the device gap between RGB and IR cameras. Most existing methods rely on manually filtered and aligned images and are therefore limited in real-world applications. Some recent methods learn directly from misaligned images, but they benefit only weakly from the second modality and may be misled by severely misaligned IR images. Since manually aligned images are available during training but unavailable at inference, we explore a new learning paradigm that uses the IR modality as privileged information. During training, our model learns to hallucinate the complementary knowledge in the IR modality from the RGB modality. At inference, the model hallucinates the complementary IR modality to facilitate UAV detection. Specifically, we propose to quantize the IR features and hallucinate the codebook indices from RGB features, which is more effective and robust than hallucinating features directly. In addition, we propose to hallucinate multi-scale codebook indices hierarchically, which further improves hallucination quality. Experiments on the DroneVehicle and VisDrone datasets demonstrate the effectiveness of our method.
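The index-hallucination idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`quantize`, `hallucinate_ir`, `index_cross_entropy`), the toy codebook, the hand-built logits standing in for an RGB branch, and the use of a cross-entropy loss over indices are all assumptions made for illustration. The abstract does not specify the architecture, the loss, or how the multi-scale hierarchy is built, so only a single scale is shown.

```python
import numpy as np

def quantize(features, codebook):
    """Return the index of the nearest codebook entry for each feature vector."""
    # Squared Euclidean distance between every feature (N, D) and every code (K, D).
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def hallucinate_ir(rgb_logits, codebook):
    """Pick the most likely codebook index per position and look up its entry."""
    indices = rgb_logits.argmax(axis=1)
    return codebook[indices]

def index_cross_entropy(rgb_logits, target_indices):
    """Cross-entropy between predicted index logits and IR-derived indices."""
    # Subtract the row maximum for numerical stability before the log-softmax.
    shifted = rgb_logits - rgb_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(len(target_indices))
    return -log_probs[rows, target_indices].mean()

# Toy codebook (assumption): 8 entries of dimension 4, entry k is k * [1, 1, 1, 1].
codebook = np.arange(8, dtype=float)[:, None] * np.ones((1, 4))

# Training time: aligned IR features exist, so their indices serve as targets.
ir_features = codebook[[3, 5, 1]] + 0.1
target = quantize(ir_features, codebook)

# Stand-in for the RGB branch output (assumption): logits peaked at the targets.
rgb_logits = np.zeros((3, 8))
rgb_logits[np.arange(3), target] = 5.0
loss = index_cross_entropy(rgb_logits, target)

# Inference time: no IR input, only the predicted indices and the codebook.
ir_hat = hallucinate_ir(rgb_logits, codebook)
```

Predicting a discrete index turns hallucination into a classification problem over a fixed set of codes, in contrast to regressing continuous IR features directly, which is the comparison the abstract draws.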

Published

2026-03-14

How to Cite

Lou, Z., Zhang, R., Luo, Z., Cao, Q., Qian, F., Chen, J., & Fang, Y. (2026). Infrared-Privileged UAV Detection via Cross-Modal Vector-Quantization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(9), 7521–7529. https://doi.org/10.1609/aaai.v40i9.37692

Section

AAAI Technical Track on Computer Vision VI