A Fixed-Point Approach to Unified Prompt-Based Counting
DOI:
https://doi.org/10.1609/aaai.v38i4.28134
Keywords:
CV: Scene Analysis & Understanding, CV: Language and Vision, CV: Applications
Abstract
Existing class-agnostic counting models typically rely on a single type of prompt, e.g., box annotations. This paper aims to establish a comprehensive prompt-based counting framework capable of generating density maps for concerned objects indicated by various prompt types, such as box, point, and text. To achieve this goal, we begin by converting prompts from different modalities into prompt masks without requiring training. These masks are then integrated into a class-agnostic counting methodology for predicting density maps. Furthermore, we introduce a fixed-point inference along with an associated loss function to improve counting accuracy, all without introducing new parameters. The effectiveness of this method is substantiated both theoretically and experimentally. Additionally, a contrastive training scheme is implemented to mitigate dataset bias inherent in current class-agnostic counting datasets, a strategy whose effectiveness is confirmed by our ablation study. Our model excels in prominent class-agnostic datasets and exhibits superior performance in cross-dataset adaptation tasks.
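
As a rough illustration of what "fixed-point inference" means in general, the sketch below repeatedly applies an update function to an estimate until it stops changing. It is not the authors' model: the names `fixed_point_iterate` and `toy_refine`, the tolerance, and the toy update rule (which simply pulls a one-dimensional "density" toward a prompt mask) are all assumptions made for this example. In the paper, the role of the update would presumably be played by the counting network itself.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def fixed_point_iterate(
    step: Callable[[Vector], Vector],
    x0: Vector,
    tol: float = 1e-6,
    max_iters: int = 100,
) -> Tuple[Vector, int]:
    """Apply `step` repeatedly until successive estimates differ by less than `tol`.

    Returns the final estimate and the number of iterations performed.
    """
    x = list(x0)
    for i in range(1, max_iters + 1):
        x_next = step(x)
        # Largest element-wise change between successive estimates.
        delta = max(abs(a - b) for a, b in zip(x_next, x))
        x = x_next
        if delta < tol:
            return x, i
    return x, max_iters


def toy_refine(mask: Vector) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a refinement step (NOT the paper's network).

    The update is a contraction whose unique fixed point is `mask` itself,
    so the iteration is guaranteed to converge.
    """
    def step(density: Vector) -> Vector:
        return [0.5 * d + 0.5 * m for d, m in zip(density, mask)]

    return step


# Toy 1-D "prompt mask": 1.0 where the prompted object is, 0.0 elsewhere.
prompt_mask = [0.0, 1.0, 1.0, 0.0]
density, iters = fixed_point_iterate(toy_refine(prompt_mask), [0.0] * 4)
```

The toy update halves the distance to the mask at every step, so the loop stops well before `max_iters`. A learned update carries no such guarantee by default, which is one reason a convergence check and an iteration cap are both kept in the loop.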
Published
2024-03-24
How to Cite
Lin, W., & Chan, A. B. (2024). A Fixed-Point Approach to Unified Prompt-Based Counting. Proceedings of the AAAI Conference on Artificial Intelligence, 38(4), 3468-3476. https://doi.org/10.1609/aaai.v38i4.28134
Issue
Section
AAAI Technical Track on Computer Vision III