Li, Dezhuang, Ruoqi Li, Lijun Wang, Yifan Wang, Jinqing Qi, Lu Zhang, Ting Liu, Qingquan Xu, and Huchuan Lu. “You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation.” Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 1297–1305. Accessed April 19, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/20017.