Wang, C., Zhou, H., Hu, Y., Huo, Y., Li, B., Liu, T., … Zhu, J. (2024). ESRL: Efficient Sampling-Based Reinforcement Learning for Sequence Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 19107–19115. https://doi.org/10.1609/aaai.v38i17.29878