Fu, Tianhao, Xinxin Xu, Weichen Xu, Jue Chen, Ruilong Ren, Bowen Deng, Xinyu Zhao, Jian Cao, and Xixin Cao. 2026. “Two Heads Are Better Than One: Distilling Large Language Model Features into Small Models With Feature Decomposition and Mixture.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (23): 19082-90. https://doi.org/10.1609/aaai.v40i23.38981.