WANG, Y.; XU, J.; SUN, Y. End-to-End Transformer Based Model for Image Captioning. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 36, n. 3, p. 2585-2594, 2022. DOI: 10.1609/aaai.v36i3.20160. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/20160. Accessed on: 1 May 2026.