Li, M., Shi, X., Leng, H., Zhou, W., Zheng, H.-T., & Zhang, K. (2023). Learning Semantic Alignment with Global Modality Reconstruction for Video-Language Pre-training towards Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 1377-1385. https://doi.org/10.1609/aaai.v37i1.25222