Self-Prompting Analogical Reasoning for UAV Object Detection

Authors

  • Nianxin Li, School of Computer Science and Engineering, University of Electronic Science and Technology of China, China
  • Mao Ye, School of Computer Science and Engineering, University of Electronic Science and Technology of China, China
  • Lihua Zhou, School of Computer Science and Engineering, University of Electronic Science and Technology of China, China
  • Song Tang, Institute of Machine Intelligence (IMI), University of Shanghai for Science and Technology, China
  • Yan Gan, College of Computer Science, Chongqing University, China
  • Zizhuo Liang, University of Sheffield
  • Xiatian Zhu, University of Surrey

DOI:

https://doi.org/10.1609/aaai.v39i17.34026

Abstract

Unmanned Aerial Vehicle Object Detection (UAVOD) presents unique challenges due to varying altitudes, dynamic backgrounds, and the small size of objects. Traditional detection methods often struggle with these challenges, as they typically rely on visual features only and fail to extract the semantic relations between objects. To address these limitations, we propose a novel approach named Self-Prompting Analogical Reasoning (SPAR). Our method uses a vision-language model (CLIP) to generate context-aware prompts based on image features, providing rich semantic information that guides analogical reasoning. SPAR includes two main modules: self-prompting and analogical reasoning. The self-prompting module, built on a learnable description and the CLIP text encoder, generates a context-aware prompt by incorporating image-specific features; an objectness prompt score map is then produced by computing the similarity between pixel-level features and the context-aware prompt. With this score map, multi-scale image features are enhanced and pixel-level features are chosen for graph construction. In the analogical reasoning module, the graph nodes consist of category-level prompt nodes and pixel-level image feature nodes, and analogical inference is based on graph convolution. Under the guidance of the category-level nodes, object features at different scales are enhanced, which helps achieve more accurate detection of challenging objects. Extensive experiments show that SPAR outperforms traditional methods, offering a more robust and accurate solution for UAVOD.
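
The pipeline described in the abstract (score map from pixel-prompt similarity, pixel selection, graph convolution over prompt and pixel nodes) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, tensor shapes, the use of cosine similarity, the top-k selection rule, and the similarity-based adjacency are all assumptions made for illustration, and random arrays stand in for CLIP features.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero


def objectness_score_map(pixel_feats, prompt_emb):
    """Cosine similarity between every pixel feature and one prompt embedding.

    pixel_feats: (H, W, C) array; prompt_emb: (C,) array. Returns an (H, W) map.
    Cosine similarity is an assumed choice of similarity measure.
    """
    p = pixel_feats / (np.linalg.norm(pixel_feats, axis=-1, keepdims=True) + EPS)
    t = prompt_emb / (np.linalg.norm(prompt_emb) + EPS)
    return p @ t


def select_top_k_pixels(pixel_feats, score_map, k):
    """Pick the k highest-scoring pixel features (top-k is a hypothetical rule)."""
    flat_feats = pixel_feats.reshape(-1, pixel_feats.shape[-1])
    idx = np.argsort(score_map.reshape(-1))[::-1][:k]
    return flat_feats[idx], idx


def graph_conv_step(nodes, weight):
    """One graph-convolution step: row-normalised adjacency, linear map, ReLU.

    The adjacency is built from non-negative cosine affinities between nodes,
    which is an assumption; the paper may construct edges differently.
    """
    n = nodes / (np.linalg.norm(nodes, axis=-1, keepdims=True) + EPS)
    adj = np.maximum(n @ n.T, 0.0)                       # diagonal acts as self-loop
    adj = adj / (adj.sum(axis=1, keepdims=True) + EPS)   # row-normalise
    return np.maximum(adj @ nodes @ weight, 0.0)


# Toy usage with random stand-ins for image and text features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 8, 16))        # pixel-level features
prompt = rng.normal(size=16)               # context-aware prompt embedding
category_nodes = rng.normal(size=(3, 16))  # category-level prompt nodes

scores = objectness_score_map(feats, prompt)
pixel_nodes, idx = select_top_k_pixels(feats, scores, k=10)
nodes = np.concatenate([category_nodes, pixel_nodes], axis=0)
out = graph_conv_step(nodes, rng.normal(size=(16, 16)) * 0.1)
```

In this sketch the category-level nodes and the selected pixel nodes share one graph, so each pixel node's updated feature mixes in information from the prompt nodes it resembles, which is the intuition behind category-guided enhancement.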

Published

2025-04-11

How to Cite

Li, N., Ye, M., Zhou, L., Tang, S., Gan, Y., Liang, Z., & Zhu, X. (2025). Self-Prompting Analogical Reasoning for UAV Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(17), 18412–18420. https://doi.org/10.1609/aaai.v39i17.34026

Section

AAAI Technical Track on Machine Learning III