Fang, X. (2026) “Rethinking Video-Language Model from the Language Input Perspective”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(5), pp. 3885–3893. doi: 10.1609/aaai.v40i5.37390.