AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries
DOI:
https://doi.org/10.1609/aaai.v38i14.29487
Keywords:
ML: Learning on the Edge & Model Compression, CV: Object Detection & Categorization
Abstract
DEtection TRansformer (DETR)-based models have achieved remarkable performance. However, they incur a large computational overhead, which significantly hinders their deployment on resource-limited devices. Prior arts attempt to reduce the computational burden of DETR using low-bit quantization, but these methods suffer a severe performance drop under weight-activation-attention low-bit quantization. We observe that the number of matched queries and positive samples strongly affects the representation capacity of queries in DETR, and quantizing the queries of DETR further reduces their representational capacity, thus leading to a severe performance drop. We introduce a new quantization strategy based on Auxiliary Queries for DETR (AQ-DETR), aiming to enhance the capacity of quantized queries. In addition, a layer-by-layer distillation is proposed to reduce the quantization error between quantized attention and its full-precision counterpart. Through extensive experiments on large-scale open datasets, we show that the performance of 4-bit quantized DETR and Deformable DETR models is comparable to their full-precision counterparts.
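The abstract does not specify the exact quantizer or distillation loss; as a minimal, hypothetical sketch of the two ideas it mentions, a uniform 4-bit fake quantizer and a layer-by-layer distillation loss between quantized and full-precision attention maps could look like the following (the function names fake_quantize and layerwise_attention_distillation are illustrative assumptions, not the paper's implementation):

```python
import torch
import torch.nn.functional as F

def fake_quantize(x: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    # Uniform symmetric fake quantization (illustrative only; the paper's
    # actual quantizer is not described in this abstract).
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.round(x / scale).clamp(-qmax - 1, qmax) * scale

def layerwise_attention_distillation(quant_attn_maps, fp_attn_maps):
    # Sum a per-layer MSE between the quantized model's attention maps and
    # the full-precision teacher's attention maps (one term per layer).
    loss = torch.zeros(())
    for q_attn, fp_attn in zip(quant_attn_maps, fp_attn_maps):
        loss = loss + F.mse_loss(q_attn, fp_attn.detach())
    return loss
```

This distillation term would typically be added to the detection loss during quantization-aware training, so that each quantized attention layer is pulled toward its full-precision counterpart.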
Published
2024-03-24
How to Cite
Wang, R., Sun, H., Yang, L., Lin, S., Liu, C., Gao, Y., Hu, Y., & Zhang, B. (2024). AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15598-15606. https://doi.org/10.1609/aaai.v38i14.29487
Issue
Section
AAAI Technical Track on Machine Learning V