Verma, D., Haldar, A., & Dutta, T. (2023). Leveraging Weighted Cross-Graph Attention for Visual and Semantic Enhanced Video Captioning Network. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 2465–2473. https://doi.org/10.1609/aaai.v37i2.25343