Consensus-Aligned Neuron Efficient Fine-Tuning Large Language Models for Multi-Domain Machine Translation

Shuting Jiang; Ran Song; Yuxin Huang; Yan Xiang; Yantuan Xian; Shengxiang Gao; Zhengtao Yu

doi:10.1609/aaai.v40i37.40394

Authors

Shuting Jiang: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China
Ran Song: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China
Yuxin Huang: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China
Yan Xiang: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China
Yantuan Xian: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China
Shengxiang Gao: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China
Zhengtao Yu: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming, China

DOI:

https://doi.org/10.1609/aaai.v40i37.40394

Abstract

Multi-domain machine translation (MDMT) aims to build a unified model capable of translating content across diverse domains. Although large language models (LLMs) demonstrate impressive machine translation capabilities, domain adaptation remains a challenge for them. Existing MDMT methods, such as in-context learning and parameter-efficient fine-tuning (PEFT), often suffer from domain shift, parameter interference, and limited generalization. In this work, we propose a neuron-efficient fine-tuning framework for MDMT that identifies and updates consensus-aligned neurons within LLMs. These neurons are selected by maximizing the mutual information between neuron behavior and domain features, enabling LLMs to capture both generalizable translation patterns and domain-specific nuances. Our method then fine-tunes LLMs guided by these neurons, effectively mitigating parameter interference and domain-specific overfitting. Comprehensive experiments on three LLMs across ten German-English and Chinese-English translation domains show that our method consistently outperforms strong PEFT baselines on both seen and unseen domains, achieving state-of-the-art performance.
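
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea of scoring neurons by their mutual information with a domain label and then updating only the top-scoring ones. Everything in it is an assumption made for illustration: the function names (`mutual_information`, `binarize`, `select_neurons`, `masked_update`), the mean-threshold discretization, the top-k rule, and the plain SGD step are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def binarize(values):
    """Assumed discretization: 1 if the activation exceeds the neuron's mean, else 0."""
    mean = sum(values) / len(values)
    return [1 if v > mean else 0 for v in values]


def select_neurons(activations, domains, k):
    """Rank neurons by MI between their binarized activation and the domain label.

    activations[i][j] is the activation of neuron j on example i;
    domains[i] is the domain label of example i.
    Returns (sorted indices of the k highest-scoring neurons, all scores).
    """
    num_neurons = len(activations[0])
    scores = []
    for j in range(num_neurons):
        column = [row[j] for row in activations]
        scores.append(mutual_information(binarize(column), domains))
    ranked = sorted(range(num_neurons), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


def masked_update(weights, grads, selected, lr):
    """One SGD step that changes only the weights of the selected neurons."""
    keep = set(selected)
    return [w - lr * g if j in keep else w
            for j, (w, g) in enumerate(zip(weights, grads))]


# Toy data: 4 examples, 3 neurons, 2 domains (hypothetical values).
activations = [
    [1.0, 0.5, 1.0],
    [0.9, 0.5, 0.0],
    [0.1, 0.5, 1.0],
    [0.0, 0.5, 0.0],
]
domains = ["law", "law", "medical", "medical"]

selected, scores = select_neurons(activations, domains, k=1)
new_weights = masked_update([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], selected, lr=0.1)
```

In this toy setup only the first neuron's activation separates the two domains, so it is the one selected and the only one whose weight changes. In a real LLM the activations would come from forward passes over domain-labeled translation data, and the mask would be applied to the corresponding parameter rows during fine-tuning.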
