Fu, Tianhao, et al. “Two Heads Are Better Than One: Distilling Large Language Model Features into Small Models With Feature Decomposition and Mixture.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 23, Mar. 2026, pp. 19082-90, doi:10.1609/aaai.v40i23.38981.