StyleProto: Style-Augmented Prototype Learning for Cross-Domain Few-Shot Object Detection
DOI:
https://doi.org/10.1609/aaai.v40i14.38160
Abstract
Cross-Domain Few-Shot Object Detection (CD-FSOD) faces significant challenges due to the dual issues of domain shift and limited labeled samples. One major challenge is style bias, caused by limited support samples that fail to represent the target domain’s style diversity. Another is feature confusion, which stems from distribution shifts and limited supervision, manifesting as both object-background ambiguity and object-object confusion. To address these challenges, we propose Style-Augmented Prototype Learning (StyleProto), which constructs style-aware prototypes from support samples with diverse visual styles, and refines them via spatial weighting and discriminative fusion. Specifically, our StyleProto consists of three components: (1) Style Generation Augmentation (SGA); (2) Semantic-Focused Prototype Construction (SPC); (3) Hierarchical Prototype Fusion Aggregator (HPFA). SGA synthesizes style-diverse yet semantically consistent training samples by recombining style statistics from the support set, thus improving robustness to unseen styles. SPC aggregates support features using spatial attention to highlight object semantics and suppress background noise, yielding cleaner and more distinctive class prototypes. HPFA leverages query-guided attention to integrate discriminative support features, enhancing prototype representations with richer class-specific details. Extensive experiments on multiple benchmarks demonstrate that StyleProto consistently outperforms existing state-of-the-art methods.
Published
2026-03-14
How to Cite
Yang, X., & Xie, Q. (2026). StyleProto: Style-Augmented Prototype Learning for Cross-Domain Few-Shot Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 40(14), 11748–11756. https://doi.org/10.1609/aaai.v40i14.38160
Issue
Section
AAAI Technical Track on Computer Vision XI
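
Illustrative sketch
The abstract says that SGA recombines style statistics from the support set but gives no formula. The sketch below shows one common way such recombination is done in the style-transfer literature: treat the per-channel mean and standard deviation of a feature map as its "style", normalize a content feature map, and re-scale it with a convex mix of two other samples' statistics (AdaIN/MixStyle-like). This is an assumption for illustration only, not the authors' SGA; the function names, the mixing weight `lam`, and the random stand-in features are all hypothetical.

```python
import numpy as np

def channel_stats(feat, eps=1e-5):
    """Per-channel mean and std over the spatial dims of a (C, H, W) map."""
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = np.sqrt(feat.var(axis=(1, 2), keepdims=True) + eps)
    return mean, std

def recombine_style(content, style_a, style_b, lam=0.5):
    """Hypothetical helper: keep the spatial layout of `content`, but replace
    its channel statistics with a convex mix of the statistics of two
    other samples (weight `lam` on style_a, 1 - lam on style_b)."""
    c_mean, c_std = channel_stats(content)
    a_mean, a_std = channel_stats(style_a)
    b_mean, b_std = channel_stats(style_b)
    mix_mean = lam * a_mean + (1.0 - lam) * b_mean
    mix_std = lam * a_std + (1.0 - lam) * b_std
    # Normalize away the content's own style, then apply the mixed style.
    return (content - c_mean) / c_std * mix_std + mix_mean

# Random stand-ins for support-set feature maps (assumed shape: C=4, H=W=8).
rng = np.random.default_rng(0)
content = rng.normal(0.0, 1.0, size=(4, 8, 8))
style_a = rng.normal(2.0, 3.0, size=(4, 8, 8))
style_b = rng.normal(-1.0, 0.5, size=(4, 8, 8))

out = recombine_style(content, style_a, style_b, lam=0.3)
```

In this sketch the output keeps the shape and spatial pattern of `content`, while its channel-wise mean and standard deviation follow the mixed statistics. Sampling `lam` at random per training step would be one way to obtain varied styles from a small support set.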