Jang, J., Kong, C., Jeon, D., Kim, S. and Kwak, N. (2023) “Unifying Vision-Language Representation Space with Single-Tower Transformer”, Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), pp. 980-988. doi: 10.1609/aaai.v37i1.25178.