[1]
Zeng, Y. et al. 2023. Acceleration of Large Transformer Model Training by Sensitivity-Based Layer Dropping. Proceedings of the AAAI Conference on Artificial Intelligence. 37, 9 (Jun. 2023), 11156–11163. DOI:https://doi.org/10.1609/aaai.v37i9.26321.