FP=XINT: Representing Neural Networks via Low-Bit Series Basis Functions

Boyang Zhang; Daning Cheng; Yunquan Zhang; Jiake Tian; Jing Li; Fangming Liu

doi:10.1609/aaai.v40i33.40043

Authors

Boyang Zhang: Institute of Computing Technology, Chinese Academy of Sciences; Pengcheng Laboratory; University of Chinese Academy of Sciences
Daning Cheng: Institute of Computing Technology, Chinese Academy of Sciences
Yunquan Zhang: Institute of Computing Technology, Chinese Academy of Sciences
Jiake Tian: Pengcheng Laboratory; South China University of Technology
Jing Li: Harbin Institute of Technology
Fangming Liu: Pengcheng Laboratory

DOI:

https://doi.org/10.1609/aaai.v40i33.40043

Abstract

Deep neural networks are often over-parameterized, resulting in prohibitive storage and computational costs. A fundamental question is whether a complex network can be re-expressed in terms of a compact set of basis functions without sacrificing accuracy. Motivated by this perspective, we aim to approximate a dense model by decomposing it into a small number of lightweight components that capture the essential functional structure of the network. To this end, we propose a series expansion framework that rewrites a neural network as a linear combination of low-bit basis models. Within the post-training quantization setting, the full-precision model is expanded hierarchically at the tensor, layer, and model levels into a structured set of basis functions. We theoretically prove that this expansion converges exponentially to the original model. Furthermore, we design AbelianAdd and AbelianMul operations between isomorphic basis models, endowing the expansion with an Abelian group structure that naturally supports commutative and parallel computation. Experimental results across diverse architectures show that our series expansion method leverages a set of ultra-low-bit basis functions, not only preserving full-precision performance without the need for calibration data or fine-tuning, but also featuring a parallel-friendly design that enables efficient and scalable deployment.
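
The tensor-level idea in the abstract can be illustrated with a small sketch. It assumes plain residual quantization with a symmetric uniform quantizer, which is one simple way to build a low-bit series and is not necessarily the construction used in the paper. The names `quantize`, `series_expand`, and `reconstruct` are hypothetical helpers introduced only for this illustration. Each term quantizes whatever the previous terms left unexplained, so the worst-case error shrinks geometrically with the number of terms.

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform quantizer: returns (integer codes, float scale)."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4-bit codes
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:                      # nothing left to approximate
        return np.zeros(x.shape, dtype=np.int32), 1.0
    scale = max_abs / qmax
    codes = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return codes, scale

def series_expand(w, bits, terms):
    """Expand w into `terms` low-bit tensors by repeatedly quantizing the residual.

    Assumption: residual quantization stands in for the paper's expansion.
    """
    residual = w.astype(np.float64)
    basis = []
    for _ in range(terms):
        codes, scale = quantize(residual, bits)
        basis.append((codes, scale))
        residual = residual - scale * codes  # what the series still misses
    return basis

def reconstruct(basis):
    """Sum the scaled low-bit terms back into a floating-point tensor."""
    return sum(scale * codes for codes, scale in basis)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))            # stand-in for a weight matrix
x = rng.standard_normal(64)                  # stand-in for an input vector

basis = series_expand(w, bits=4, terms=4)

# Maximum absolute error after keeping the first k terms, k = 1..4.
errors = [float(np.max(np.abs(w - reconstruct(basis[:k])))) for k in range(1, 5)]

# For a linear layer, the output is the sum of the per-term outputs.
# Addition is commutative, so the terms can be evaluated in any order
# or in parallel and combined afterwards.
y_series = sum(scale * (codes @ x) for codes, scale in basis)
y_full = w @ x
```

With a 4-bit quantizer in this sketch, the rounding error of each term is at most half a quantization step, so each added term cuts the maximum error by a factor of roughly 14. This mirrors the exponential convergence the abstract claims, although the paper's own proof and its AbelianAdd and AbelianMul operations are not reproduced here.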
