Code-switching Mediated Sentence-level Semantic Learning

Authors

  • Shuai Zhang Department of Automation & BNRist, Tsinghua University
  • Jiangyan Yi Department of Automation & BNRist, Tsinghua University
  • Zhengqi Wen Department of Automation & BNRist, Tsinghua University
  • Jianhua Tao Department of Automation & BNRist, Tsinghua University
  • Feihu Che Department of Automation & BNRist, Tsinghua University
  • Jinyang Wu Department of Automation & BNRist, Tsinghua University
  • Ruibo Fu Institute of Automation, Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v39i24.34785

Abstract

Code-switching is a linguistic phenomenon in which speakers alternate between different languages within a conversation. It poses significant performance challenges for natural language processing (NLP) tasks because the underlying systems are often monolingual. We focus on the sentence-level semantic associations between different code-switching expressions, and we propose an innovative task-free semantic learning method based on this semantic property. Specifically, a sentence with a given meaning can be expressed through many different language-switching patterns. We operationalize this observation as a semantic computational method by designing a semantic invariant constraint loss applied during model optimization. In this work, we conduct thorough experiments on speech recognition, speech translation, and language modeling tasks. The experimental results demonstrate that the proposed method consistently improves performance on code-switching related tasks.
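One plausible reading of the "semantic invariant constraint" described above is a loss that penalizes divergence between sentence embeddings of code-switched variants that share one meaning. The sketch below is a hypothetical illustration of that idea using NumPy; the function name, the mean-pairwise-squared-distance formulation, and the embedding shapes are assumptions, not the paper's actual implementation.

```python
import numpy as np

def semantic_invariance_loss(variant_embeddings: np.ndarray) -> float:
    """Mean pairwise squared L2 distance between sentence embeddings of
    code-switched variants of the same sentence.

    variant_embeddings: array of shape (num_variants, embedding_dim),
    one row per code-switched rendering of a single sentence.
    Returns 0.0 when all variants map to the same embedding, i.e. when
    the (hypothetical) semantic invariance constraint is satisfied.
    """
    n = len(variant_embeddings)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = variant_embeddings[i] - variant_embeddings[j]
            total += float(np.sum(diff ** 2))
            pairs += 1
    return total / pairs if pairs else 0.0

# Identical embeddings for every variant -> zero penalty;
# diverging embeddings -> positive penalty to be minimized.
same = np.ones((3, 4))
print(semantic_invariance_loss(same))  # 0.0
```

In training, such a term would be added to the task loss (e.g. the ASR or translation objective), pushing the encoder toward language-agnostic sentence representations.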

Published

2025-04-11

How to Cite

Zhang, S., Yi, J., Wen, Z., Tao, J., Che, F., Wu, J., & Fu, R. (2025). Code-switching Mediated Sentence-level Semantic Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 39(24), 25913–25921. https://doi.org/10.1609/aaai.v39i24.34785

Section

AAAI Technical Track on Natural Language Processing III