(1) Wang, B.; Li, J.; Chen, H.; Chu, Y.; Fan, Y.; Hu, X. Deconstructing Pre-Training: Knowledge Attribution Analysis in MoE and Dense Models. AAAI 2026, 40, 33359-33367.