Yan, Rui, Mike Zheng Shou, Yixiao Ge, Jinpeng Wang, Xudong Lin, Guanyu Cai, and Jinhui Tang. “Video-Text Pre-Training With Learned Regions for Retrieval.” Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3100–3108. Accessed September 18, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/25414.