Wang, G., & Sun, P. (2026). Speech Recognition Model Improves Text-to-Speech Synthesis Using Fine-Grained Reward. Proceedings of the AAAI Conference on Artificial Intelligence, 40(39), 33440–33448. https://doi.org/10.1609/aaai.v40i39.40631