Chen, Yizhen, et al. “Tagging before Alignment: Integrating Multi-Modal Tags for Video-Text Retrieval”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 1, June 2023, pp. 396-404, doi:10.1609/aaai.v37i1.25113.