(1) Gao, S.; Chen, Z.; Chen, G.; Wang, W.; Lu, T. AVSegFormer: Audio-Visual Segmentation With Transformer. AAAI 2024, 38, 12155–12163.