Nguyen, K., Biten, A. F., Mafla, A., Gomez, L., & Karatzas, D. (2023). Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 1940-1948. https://doi.org/10.1609/aaai.v37i2.25285