DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic

Authors

  • Hazem Hesham Yousef Shalby, Department of Electronics, Information and Bioengineering, Politecnico di Milano
  • Fabrizio Pittorino, Department of Electronics, Information and Bioengineering, Politecnico di Milano
  • Francesca Palermo, EssilorLuxottica Smart Eyewear Lab
  • Diana Trojaniello, EssilorLuxottica Smart Eyewear Lab
  • Manuel Roveri, Department of Electronics, Information and Bioengineering, Politecnico di Milano

DOI:

https://doi.org/10.1609/aaai.v40i30.39717

Abstract

The deployment of deep neural networks on resource-constrained devices relies on quantization. While static, uniform quantization applies a fixed bit-width to all inputs, it fails to adapt to their varying complexity. Dynamic, instance-based mixed-precision quantization promises a superior accuracy-efficiency trade-off by allocating higher precision only when needed. However, a critical bottleneck remains: existing methods require a costly dequantize-to-float and requantize-to-integer cycle to change precision, breaking the integer-only hardware paradigm and compromising the performance gains. This paper introduces Dynamic Quantization Training (DQT), a novel framework that removes this bottleneck. At the core of DQT is a nested integer representation in which lower-precision values are bit-wise embedded within higher-precision ones. This design, coupled with custom integer-only arithmetic, allows on-the-fly bit-width switching through a near-zero-cost bit-shift operation. DQT is thus the first quantization framework to enable both dequantization-free static mixed-precision quantization of the backbone network and truly efficient dynamic, instance-based quantization through a lightweight controller that decides at runtime how to quantize each layer. We demonstrate DQT's state-of-the-art performance with ResNet18 on CIFAR-10 and ResNet50 on ImageNet. On ImageNet, our 4-bit dynamic ResNet50 achieves 77.00% top-1 accuracy, an improvement over leading static (LSQ, 76.70%) and dynamic (DQNET, 76.94%) methods at a comparable BitOPs budget. Crucially, DQT achieves this with a bit-width transition cost of only 28.3M simple bit-shift operations, a drastic improvement over the 56.6M costly floating-point multiply-accumulate (MAC) operations required by previous dynamic approaches, unlocking a new frontier in efficient, adaptive AI.
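
The core mechanism described in the abstract, nesting a low-precision integer code in the top bits of a higher-precision one so that a precision change is a bit shift rather than a dequantize/requantize round trip, can be illustrated with a short sketch. This is a minimal illustration under an assumed symmetric uniform quantization scheme; the function names `quantize` and `shift_to_lower_bits` and the NumPy setting are ours for exposition, not the paper's actual API.

```python
import numpy as np

def quantize(x: np.ndarray, scale: float, bits: int = 8) -> np.ndarray:
    """Map floats to signed `bits`-bit integer codes (assumed symmetric scheme)."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)

def shift_to_lower_bits(q: np.ndarray, from_bits: int, to_bits: int) -> np.ndarray:
    """Switch precision without dequantizing: in a nested representation the
    `to_bits`-bit code occupies the top bits of the `from_bits`-bit code, so a
    single arithmetic right shift recovers it (no float round trip)."""
    return q >> (from_bits - to_bits)

x = np.array([0.70, -0.30, 0.10])
scale = 0.01
q8 = quantize(x, scale, bits=8)        # 8-bit codes: [ 70 -30  10]
q4 = shift_to_lower_bits(q8, 8, 4)     # 4-bit codes: [  4  -2   0]
# Dropping from 8 to 4 bits enlarges the effective scale by 2**(8-4) = 16,
# so a coarse reconstruction uses the adjusted scale:
x4 = q4 * (scale * 2 ** (8 - 4))       # [ 0.64 -0.32  0.00]
```

Under these assumptions, a bit-width transition costs one shift per value, which is consistent with the abstract's contrast between simple bit-shift operations and the floating-point MACs incurred by dequantize-and-requantize approaches.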

Published

2026-03-14

How to Cite

Shalby, H. H. Y., Pittorino, F., Palermo, F., Trojaniello, D., & Roveri, M. (2026). DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic. Proceedings of the AAAI Conference on Artificial Intelligence, 40(30), 25252–25259. https://doi.org/10.1609/aaai.v40i30.39717

Issue

Vol. 40 No. 30 (2026)

Section

AAAI Technical Track on Machine Learning VII