[1] Yan, S., Zhang, R., Guo, Z., Chen, W., Zhang, W., Li, H., Qiao, Y., Dong, H., He, Z. and Gao, P. 2024. Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence. 38, 6 (Mar. 2024), 6449-6457. DOI:https://doi.org/10.1609/aaai.v38i6.28465.