Zhuang, J. (2026) “Q Cache: Visual Attention Is Valuable in Less than Half of Decode Layers for Multimodal Large Language Model”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), pp. 14031–14039. doi: 10.1609/aaai.v40i16.38414.