Yang, Bang, et al. “Non-Autoregressive Coarse-to-Fine Video Captioning.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 4, May 2021, pp. 3119–27, doi:10.1609/aaai.v35i4.16421.