Wu, H., & Chen, L. (2026). TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 26894–26902. https://doi.org/10.1609/aaai.v40i32.39901