Li, D., Li, R., Wang, L., Wang, Y., Qi, J., Zhang, L., Liu, T., Xu, Q., & Lu, H. (2022). You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 36(2), 1297-1305. https://doi.org/10.1609/aaai.v36i2.20017