1. Hua H, Tang Y, Xu C, Luo J. V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning. AAAI [Internet]. 2025 Apr 11 [cited 2026 May 2];39(4):3599-607. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/32374