[1] Jang, J., Kong, C., Jeon, D., Kim, S. and Kwak, N. 2023. Unifying Vision-Language Representation Space with Single-Tower Transformer. Proceedings of the AAAI Conference on Artificial Intelligence. 37, 1 (Jun. 2023), 980-988. DOI:https://doi.org/10.1609/aaai.v37i1.25178.