Fan, J., & Chen, C.-M. (2026). Efficient Multimodal Large Language Model via Dynamic KV Cache Quantization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 20994–21001. https://doi.org/10.1609/aaai.v40i25.39241