Fang, K., L. Zhou, C. Jin, Y. Zhang, K. Weng, T. Zhang, and W. Fan. “Fully Convolutional Video Captioning With Coarse-to-Fine and Inherited Attention.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, July 2019, pp. 8271-8278, doi:10.1609/aaai.v33i01.33018271.