On the Impact of Weight Quantization on Deep Neural Network Uncertainty

Shuang Liang; Xun Lu; Zi-Ang Liu; Ming-Liang Wang; Yan Lyu; Shao-Qun Zhang

doi:10.1609/aaai.v40i28.39513

Authors

Shuang Liang: National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Intelligent Science and Technology, Nanjing University, China
Xun Lu: School of Intelligent Science and Technology, Nanjing University, China
Zi-Ang Liu: China Mobile Zijin Innovation Institute, China
Ming-Liang Wang: China Mobile Zijin Innovation Institute, China
Yan Lyu: China Mobile Zijin Innovation Institute, China
Shao-Qun Zhang: National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Intelligent Science and Technology, Nanjing University, China

Abstract

Weight Quantization (WQ) is a key technique for lightweight Deep Neural Network (DNN) computation. While existing algorithms typically pursue memory compression and inference acceleration at accuracy comparable to full-precision models, the effect of WQ on DNN uncertainty remains largely unexplored. In this paper, we quantify the impact of WQ on DNN uncertainty with a novel uncertainty estimator, Exact Moment Propagation (EMP), and observe that WQ significantly increases DNN uncertainty. Building on the EMP estimator, we propose MOMent Alignment (MOMA) to reduce WQ-induced uncertainty while preserving the accuracy of weight-quantized DNNs. Empirical results across various DNN architectures and datasets validate the effectiveness of both EMP and MOMA.
