(1) Ruan, L.; Hu, A.; Song, Y.; Zhang, L.; Zheng, S.; Jin, Q. Accommodating Audio Modality in CLIP for Multimodal Processing. AAAI 2023, 37, 9641-9649.