Fast and Slow Gradient Approximation for Binary Neural Network Optimization

Authors

  • Xinquan Chen, Harbin Institute of Technology
  • Junqi Gao, Harbin Institute of Technology; Shanghai Artificial Intelligence Laboratory
  • Biqing Qi, Shanghai Artificial Intelligence Laboratory
  • Dong Li, Harbin Institute of Technology
  • Yiang Luo, Harbin Institute of Technology
  • Fangyuan Li, Harbin Institute of Technology
  • Pengfei Li, Harbin Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v39i25.34896

Abstract

Binary Neural Networks (BNNs) have garnered significant attention due to their immense potential for deployment on edge devices. However, the non-differentiability of the quantization function poses a challenge for BNN optimization, as its derivative cannot be backpropagated. To address this issue, hypernetwork-based methods, which use neural networks to learn the gradients of non-differentiable quantization functions, have emerged as a promising approach because their adaptive learning capability reduces estimation errors. However, existing hypernetwork-based methods typically rely solely on current gradient information and neglect the influence of historical gradients. This oversight can lead to accumulated gradient errors when computing gradient momentum during optimization. To incorporate historical gradient information, we design a Historical Gradient Storage (HGS) module, which models the historical gradient sequence to generate the first-order momentum required for optimization. To further enhance gradient generation in hypernetworks, we propose a Fast and Slow Gradient Generation (FSG) method. Additionally, to produce more precise gradients, we introduce Layer Recognition Embeddings (LRE) into the hypernetwork, facilitating the generation of layer-specific fine-grained gradients. Extensive comparative experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that our method achieves faster convergence and lower loss values than existing baselines.
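To make the setting concrete, the sketch below shows (a) the vanilla straight-through estimator (STE), the fixed hand-crafted gradient surrogate that hypernetwork-based methods aim to replace, and (b) a hypothetical gradient-generating hypernetwork that summarizes a stored sequence of past gradients with a GRU (standing in for the paper's HGS buffer) and adds a learned per-layer embedding (standing in for LRE). The names GradientHypernetwork and BinarizeSTE, all shapes, and the module choices are illustrative assumptions, not the authors' FSG/HGS/LRE implementation.

```python
import torch
import torch.nn as nn
from collections import deque

class BinarizeSTE(torch.autograd.Function):
    """Sign quantizer with the vanilla straight-through estimator (STE)
    backward pass -- the fixed gradient surrogate that hypernetwork-based
    methods aim to replace with a learned one."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass gradients through only where |w| <= 1 (clipped identity).
        return grad_out * (w.abs() <= 1).float()

class GradientHypernetwork(nn.Module):
    """Hypothetical gradient-generating hypernetwork: a GRU summarizes a
    stored sequence of past gradients (standing in for the paper's HGS
    buffer), a learned per-layer embedding (standing in for LRE) is added,
    and a linear head emits a momentum-like per-parameter update. All
    shapes and module choices are assumptions for illustration only."""

    def __init__(self, hidden=16, num_layers=4):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.layer_emb = nn.Embedding(num_layers, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, grad_history, layer_idx):
        # grad_history: (num_params, history_len, 1) per-parameter sequence.
        _, h = self.rnn(grad_history)                  # (1, num_params, hidden)
        h = h.squeeze(0) + self.layer_emb(layer_idx)   # inject layer identity
        return self.head(h).squeeze(-1)                # per-parameter update

# Toy usage: binarize a weight vector to match a random sign target, keeping
# a bounded buffer of recent STE gradients for the hypernetwork to consume.
# (In practice the hypernetwork itself would also be trained; that loop is
# omitted here.)
torch.manual_seed(0)
w = torch.randn(10, requires_grad=True)
target = torch.sign(torch.randn(10))
hyper = GradientHypernetwork(num_layers=1)
history = deque(maxlen=8)

for step in range(20):
    loss = ((BinarizeSTE.apply(w) - target) ** 2).sum()
    loss.backward()
    history.append(w.grad.detach().clone())
    with torch.no_grad():
        if len(history) == history.maxlen:
            seq = torch.stack(list(history), dim=1).unsqueeze(-1)  # (10, 8, 1)
            w -= 0.01 * hyper(seq, torch.zeros(10, dtype=torch.long))
        else:
            w -= 0.01 * w.grad  # warm-up: plain STE step until buffer fills
    w.grad = None
```

The bounded deque is the key idea being illustrated: instead of computing momentum from possibly erroneous surrogate gradients directly, the learned module sees the whole recent gradient sequence and can correct accumulated estimation error before the update is applied.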

Published

2025-04-11

How to Cite

Chen, X., Gao, J., Qi, B., Li, D., Luo, Y., Li, F., & Li, P. (2025). Fast and Slow Gradient Approximation for Binary Neural Network Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 39(25), 26913–26921. https://doi.org/10.1609/aaai.v39i25.34896

Issue

Vol. 39 No. 25 (2025)

Section

AAAI Technical Track on Search and Optimization