Fine-Tuning Language Models with Collaborative and Semantic Experts
DOI:
https://doi.org/10.1609/aaai.v39i24.34753
Abstract
Recent advancements in large language models (LLMs) have broadened their application scope but revealed challenges in balancing capabilities across general knowledge, coding, and mathematics. To address this, we introduce a Collaborative and Semantic Experts (CoE) approach for supervised fine-tuning (SFT), which employs a two-phase training strategy. Initially, expert training fine-tunes the feed-forward network on specialized datasets, developing distinct experts in targeted domains. Subsequently, expert leveraging synthesizes these trained experts into a structured model with semantic guidance to activate specific experts, enhancing performance and interpretability. Evaluations on comprehensive benchmarks across MMLU, HumanEval, GSM8K, MT-Bench, and AlpacaEval confirm CoE's efficacy, demonstrating improved performance and expert collaboration in diverse tasks, significantly outperforming traditional SFT methods.
Published
2025-04-11
How to Cite
Yang, J., Hui, B., Yang, M., Yang, J., Zhang, L., Qu, Q., & Lin, J. (2025). Fine-Tuning Language Models with Collaborative and Semantic Experts. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25624–25632. https://doi.org/10.1609/aaai.v39i24.34753
Section
AAAI Technical Track on Natural Language Processing III