Salin, E., Farah, B., Ayache, S. and Favre, B. (2022) “Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(10), pp. 11248-11257. doi: 10.1609/aaai.v36i10.21375.