1. Wang B, Li J, Chen H, Chu Y, Fan Y, Hu X. Deconstructing Pre-training: Knowledge Attribution Analysis in MoE and Dense Models. AAAI [Internet]. 2026 Mar. 14 [cited 2026 May 16];40(39):33359-67. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/40622