(1) Wu, H.; Chen, L. TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training. AAAI 2026, 40, 26894–26902.