[1] D. Verma, A. Haldar, and T. Dutta, “Leveraging Weighted Cross-Graph Attention for Visual and Semantic Enhanced Video Captioning Network,” AAAI, vol. 37, no. 2, pp. 2465-2473, Jun. 2023.