Xu, S., Pang, L., Zhu, Y., Gu, J., Wei, Z., Deng, J., … Cheng, X. (2026). RLKD: Distilling LLMs’ Reasoning via Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(40), 34151–34159. https://doi.org/10.1609/aaai.v40i40.40710