Xu, Y., Chen, H., Yu, J., Huang, Q., Wu, Z., Zhang, S.-X., … Gu, R. (2024). SECap: Speech Emotion Captioning with Large Language Model. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 19323–19331. https://doi.org/10.1609/aaai.v38i17.29902