[1] H. Ryu, S. Kang, H. Kang, and C. D. Yoo, “Semantic Grouping Network for Video Captioning”, AAAI, vol. 35, no. 3, pp. 2514-2522, May 2021.