TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training
DOI:
https://doi.org/10.1609/aaai.v40i32.39901

Abstract
Training large language models (LLMs) is fundamentally constrained by limited device memory and costly inter-device communication. Although pipeline parallelism alleviates memory pressure by partitioning models across devices, it incurs activation communication overhead that scales linearly with sequence length, limiting efficiency in long-context training. Recent weight-passing approaches (e.g., WeiPipe) mitigate this by transmitting model weights instead of activations, but suffer from redundant peer-to-peer (P2P) transfers and underutilized intra-node bandwidth. We propose TawPipe, topology-aware weight pipeline parallelism, which exploits hierarchical bandwidth in distributed clusters for improved communication efficiency. TawPipe: (i) groups devices based on topology to optimize intra-node collective and inter-node P2P communication; (ii) assigns each device a fixed shard of model weights and gradients, avoiding redundant transfers; and (iii) overlaps communication with computation to hide latency. Unlike global collective operations used in fully sharded data parallelism (FSDP), TawPipe confines most communication within node boundaries, significantly reducing cross-node traffic. Extensive experiments on up to 24 GPUs with LLaMA-style models show that TawPipe achieves superior throughput and scalability compared to state-of-the-art baselines.
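The two mechanisms highlighted in the abstract, topology-based device grouping and overlapping weight communication with computation, can be sketched in PyTorch distributed terms. The sketch below is illustrative only and is not the authors' implementation: GPUS_PER_NODE, blocks, weight_shards, and the broadcast-from-node-leader scheme are assumptions introduced for this example.

```python
import torch.distributed as dist

GPUS_PER_NODE = 8  # assumption: number of GPUs per node; adjust to the cluster


def build_intra_node_group():
    """Partition ranks by node so weight collectives stay on intra-node links.

    Assumes dist.init_process_group() has already been called.
    """
    world_size = dist.get_world_size()
    rank = dist.get_rank()
    node_id = rank // GPUS_PER_NODE

    intra_node_group = None
    for n in range(world_size // GPUS_PER_NODE):
        ranks = list(range(n * GPUS_PER_NODE, (n + 1) * GPUS_PER_NODE))
        # new_group must be called collectively by every rank for every group
        group = dist.new_group(ranks=ranks)
        if n == node_id:
            intra_node_group = group
    return intra_node_group, node_id


def forward_with_weight_prefetch(blocks, weight_shards, x, group, node_id):
    """Run pipeline blocks while asynchronously broadcasting the next block's
    weight shard inside the node, hiding communication behind computation."""
    leader = node_id * GPUS_PER_NODE  # global rank of this node's leader
    handle = dist.broadcast(weight_shards[0], src=leader, group=group,
                            async_op=True)
    for i, block in enumerate(blocks):
        handle.wait()  # weights for block i must be resident before compute
        if i + 1 < len(blocks):
            # prefetch the next shard while block i computes
            handle = dist.broadcast(weight_shards[i + 1], src=leader,
                                    group=group, async_op=True)
        x = block(x, weight_shards[i])
    return x
```

In this sketch the only cross-node traffic would be whatever delivers shards to each node leader (e.g., inter-node P2P), while the per-block weight broadcasts remain confined to intra-node bandwidth, mirroring the communication pattern the abstract describes.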
Published

2026-03-14
How to Cite
Wu, H., & Chen, L. (2026). TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 26894–26902. https://doi.org/10.1609/aaai.v40i32.39901
Section
AAAI Technical Track on Machine Learning IX