Gao, Shengyi, Zhe Chen, Guo Chen, Wenhai Wang, and Tong Lu. 2024. “AVSegFormer: Audio-Visual Segmentation With Transformer.” Proceedings of the AAAI Conference on Artificial Intelligence 38 (11): 12155-63. https://doi.org/10.1609/aaai.v38i11.29104.