[1] C. Du et al., “UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding,” AAAI, vol. 38, no. 16, pp. 17924–17932, Mar. 2024.