[1] B. Wang, J. Li, H. Chen, Y. Chu, Y. Fan, and X. Hu, “Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models,” AAAI, vol. 40, no. 39, pp. 33359–33367, Mar. 2026.