[1] Y. Xu, “SECap: Speech Emotion Captioning with Large Language Model”, AAAI, vol. 38, no. 17, pp. 19323–19331, Mar. 2024.