[1] X. Huang, Y.-L. Huang, and Z. Wen, “SoLA: Leveraging Soft Activation Sparsity and Low-Rank Decomposition for Large Language Model Compression,” AAAI, vol. 39, no. 16, pp. 17494–17502, Apr. 2025.