Guo, Pinxue, Hao Huang, Peiyang He, Xuefeng Liu, Tianjun Xiao, and Wenqiang Zhang. “OpenVIS: Open-Vocabulary Video Instance Segmentation.” Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 3 (April 11, 2025): 3275–3283. Accessed May 13, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/32338.