Zeng, Yujie, et al. “Acceleration of Large Transformer Model Training by Sensitivity-Based Layer Dropping”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 9, June 2023, pp. 11156-63, doi:10.1609/aaai.v37i9.26321.