(1) Zeng, Y.; He, W.; Vasyltsov, I.; Pang, J.; Chen, L. Acceleration of Large Transformer Model Training by Sensitivity-Based Layer Dropping. AAAI 2023, 37, 11156-11163.