Du, C., Guo, Y., Shen, F., Liu, Z., Liang, Z., Chen, X., … Yu, K. (2024). UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding. Proceedings of the AAAI Conference on Artificial Intelligence, 38(16), 17924–17932. https://doi.org/10.1609/aaai.v38i16.29747