Fang, K., Zhou, L., Jin, C., Zhang, Y., Weng, K., Zhang, T., & Fan, W. (2019). Fully Convolutional Video Captioning with Coarse-to-Fine and Inherited Attention. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 8271-8278. https://doi.org/10.1609/aaai.v33i01.33018271