Chen, J., Y. Pan, Y. Li, T. Yao, H. Chao, and T. Mei. “Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, July 2019, pp. 8167-74, doi:10.1609/aaai.v33i01.33018167.