(1) Wang, Y.; Xu, J.; Sun, Y. End-to-End Transformer Based Model for Image Captioning. AAAI 2022, 36, 2585-2594.