D2 Prune: Sparsifying Large Language Models via Dual Taylor Expansion and Attention Distribution Awareness

Lang Xiong; Ning Liu; Ao Ren; Yuheng Bai; Haining Fang; Binyan Zhang; Zhe Jiang; Yujuan Tan; Duo Liu

doi:10.1609/aaai.v40i32.39932

Authors

Lang Xiong, Chongqing University
Ning Liu, Beijing Innovation Center of Humanoid Robotics
Ao Ren, Chongqing University
Yuheng Bai, Chongqing University
Haining Fang, Chongqing University
Binyan Zhang, Chongqing University
Zhe Jiang, Chongqing University
Yujuan Tan, National University of Defense Technology
Duo Liu, Chongqing University

DOI:

https://doi.org/10.1609/aaai.v40i32.39932

Abstract

Large language models (LLMs) face significant deployment challenges due to their massive computational demands. While pruning offers a promising compression solution, existing methods suffer from two critical limitations: (1) they neglect activation distribution shifts between calibration data and test data, resulting in inaccurate error estimates; and (2) they overlook the long-tail distribution of activations in the attention module. To address these limitations, this paper proposes a novel pruning method, D²Prune. First, we propose a dual Taylor expansion-based method that jointly models weight and activation perturbations for precise error estimation, which in turn yields more accurate pruning-mask selection and weight updates and reduces the error introduced by pruning. Second, we propose an attention-aware dynamic update strategy that preserves the long-tail attention pattern by jointly minimizing the KL divergence of attention distributions and the reconstruction error. Extensive experiments show that D²Prune consistently outperforms state-of-the-art (SOTA) methods across various LLMs (e.g., OPT-125M, LLaMA2/3, Qwen3). Moreover, the dynamic attention update mechanism also generalizes well to ViT-based vision models such as DeiT, achieving superior accuracy on ImageNet-1K.
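To make the second idea concrete, the following is a minimal sketch of what an objective combining an attention-distribution KL term with a reconstruction error could look like for a single attention head. It is not the paper's implementation: the abstract does not give the exact formulation, so the weighting factor `lam`, the single-head setup, the helper names, and the use of plain magnitude pruning as a stand-in for the paper's mask selection are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_probs(X, Wq, Wk):
    # Single-head scaled dot-product attention probabilities (rows sum to 1).
    d = Wq.shape[1]
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
    return softmax(scores, axis=-1)


def kl_divergence(P, Q, eps=1e-12):
    # Mean over query positions of KL(P_row || Q_row).
    return float(np.mean(np.sum(P * (np.log(P + eps) - np.log(Q + eps)), axis=-1)))


def magnitude_prune(W, sparsity):
    # Stand-in pruning rule (NOT the paper's mask selection):
    # zero out the fraction `sparsity` of entries with the smallest magnitude.
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy()
    thresh = np.sort(np.abs(W), axis=None)[k - 1]
    return np.where(np.abs(W) <= thresh, 0.0, W)


def joint_objective(X, Wq, Wk, Wq_pruned, Wk_pruned, lam=1.0):
    # Hypothetical combined loss: attention KL + lam * reconstruction error.
    P = attention_probs(X, Wq, Wk)                 # dense attention distribution
    Q = attention_probs(X, Wq_pruned, Wk_pruned)   # pruned attention distribution
    kl = kl_divergence(P, Q)
    recon = float(
        np.linalg.norm(X @ Wq - X @ Wq_pruned) ** 2
        + np.linalg.norm(X @ Wk - X @ Wk_pruned) ** 2
    )
    return kl + lam * recon, kl, recon
```

With random calibration activations `X` and random projections `Wq` and `Wk`, calling `joint_objective` on the dense weights against themselves gives zero for both terms, while pruned copies from `magnitude_prune(W, 0.5)` give a positive reconstruction error and a KL term that measures how far the attention rows have drifted. A weight-update step that lowers this combined value would, under these assumptions, favor updates that keep the attention pattern close to the dense model rather than only matching the projected outputs.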
