[1] W. Hao, Z. Zhang, and H. Guan, “Integrating Both Visual and Audio Cues for Enhanced Video Caption,” AAAI, vol. 32, no. 1, Apr. 2018.