OmniScale: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Qianli Ma; Yaowei Zheng; Zhelun Shi; Zhongkai Zhao; Bin Jia; Ziyue Huang; Zhiqi Lin; Youjie Li; Jiacheng Yang; Yanghua Peng; Zhi Zhang; Xin Liu

doi:10.1609/aaai.v40i29.39607

Authors

Qianli Ma, ByteDance Seed
Yaowei Zheng, ByteDance Seed
Zhelun Shi, ByteDance Seed
Zhongkai Zhao, ByteDance Seed
Bin Jia, ByteDance Seed
Ziyue Huang, ByteDance Seed
Zhiqi Lin, ByteDance Seed
Youjie Li, ByteDance Seed
Jiacheng Yang, ByteDance Seed
Yanghua Peng, ByteDance Seed
Zhi Zhang, ByteDance Seed
Xin Liu, ByteDance Seed


Abstract

Recent advances in large language models (LLMs) have driven impressive progress in omni-modal understanding and generation. However, training omni-modal LLMs remains a significant challenge: the heterogeneous model architectures required to process diverse modalities demand sophisticated system design for efficient large-scale training. Existing frameworks typically entangle model definition with parallel logic, which limits scalability and adds substantial engineering overhead for end-to-end omni-modal training. We present OmniScale, a modular and efficient training framework to accelerate the development of omni-modal LLMs. OmniScale introduces model-centric distributed recipes that decouple communication from computation, enabling efficient 3D parallelism on omni-modal LLMs. OmniScale also features a flexible configuration interface that supports seamless integration of new modalities with minimal code changes. Using OmniScale, an omni-modal mixture-of-experts (MoE) model with 30B parameters can be trained at a throughput of over 2,800 tokens/sec/GPU and scaled to a 160K context length via 3D parallelism on 128 GPUs, showcasing its superior efficiency and scalability for training large omni-modal LLMs.
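
To put the reported throughput in perspective, the following is a minimal back-of-the-envelope sketch in Python. It assumes the per-GPU figure of 2,800 tokens/sec holds on the full 128-GPU configuration, which the abstract does not state explicitly, and it treats "over 2,800" as a lower bound. The function names are hypothetical and are not part of OmniScale.

```python
# Rough aggregate-throughput estimate from the figures quoted in the abstract.
# Assumption: the per-GPU rate applies to the 128-GPU run (not confirmed by the source).

PER_GPU_TOKENS_PER_SEC = 2_800  # lower bound quoted in the abstract
NUM_GPUS = 128                  # cluster size quoted in the abstract
SECONDS_PER_DAY = 86_400


def aggregate_tokens_per_sec(per_gpu_rate: int, num_gpus: int) -> int:
    """Cluster-wide token rate, assuming the per-GPU rate is uniform across GPUs."""
    return per_gpu_rate * num_gpus


def tokens_per_day(tokens_per_sec: int) -> int:
    """Tokens processed in 24 hours at a constant rate."""
    return tokens_per_sec * SECONDS_PER_DAY


cluster_rate = aggregate_tokens_per_sec(PER_GPU_TOKENS_PER_SEC, NUM_GPUS)
print(cluster_rate)                  # 358400 tokens/sec
print(tokens_per_day(cluster_rate))  # 30965760000 tokens/day
```

Under these assumptions the cluster would process at least roughly 358K tokens per second, or about 31 billion tokens per day of uninterrupted training.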
