Feng, Z., Yang, S., Duan, B., Yang, W., & Wang, J. (2026). EM-KD: Distilling Efficient Multimodal Large Language Model with Unbalanced Vision Tokens. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 21111–21119. https://doi.org/10.1609/aaai.v40i25.39254