Gu, Jiuxiang, Jianfei Cai, Gang Wang, and Tsuhan Chen. “Stack-Captioning: Coarse-to-Fine Learning for Image Captioning”. Proceedings of the AAAI Conference on Artificial Intelligence 32, no. 1 (April 27, 2018). Accessed April 23, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/12266.