Chen, J., Pan, Y., Li, Y., Yao, T., Chao, H. and Mei, T. (2019) “Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning”, Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), pp. 8167-8174. doi: 10.1609/aaai.v33i01.33018167.