Fu, T. (2026) “Two Heads Are Better than One: Distilling Large Language Model Features into Small Models with Feature Decomposition and Mixture”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(23), pp. 19082–19090. doi: 10.1609/aaai.v40i23.38981.