Black, A., Shi, J., Fan, Y., Bui, T., & Collomosse, J. (2024). VIXEN: Visual Text Comparison Network for Image Difference Captioning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 846-854. https://doi.org/10.1609/aaai.v38i2.27843