[1] X. Shen et al., “Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge,” AAAI, vol. 38, no. 17, pp. 18944–18951, Mar. 2024.