Controlling Language Confusion in Multilingual LLMs

Large language models often suffer from language confusion, a phenomenon in which responses are generated partially or entirely in an unintended language. This can critically degrade user experience, especially in low-resource settings. We hypothesize that conventional supervised fine-tuning exacerbates this issue because the standard cross-entropy objective concentrates probability mass on the single correct token but never explicitly penalizes cross-lingual mixing. Consistent with this, an analysis of loss trajectories during pretraining shows that models fail to learn to distinguish monolingual text from language-confused text. Additionally, we find that ORPO (Odds Ratio Preference Optimization), which augments standard SFT with a penalty on dispreferred outputs, effectively suppresses language-confused generations even at high decoding temperatures, without degrading overall model performance. Our findings suggest that incorporating appropriate penalty terms into the training objective can mitigate language confusion in low-resource settings, even with limited data.
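To make the penalty concrete, below is a minimal sketch of the odds-ratio term that ORPO (Hong et al., 2024) adds on top of the SFT loss, where odds(y|x) = P(y|x) / (1 - P(y|x)) is computed from length-normalized sequence log-likelihoods. This is an illustration under stated assumptions, not the authors' training code; the function name orpo_loss, the weight lam and its default, and the toy inputs are all hypothetical.

import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps: torch.Tensor,
              rejected_logps: torch.Tensor,
              sft_loss: torch.Tensor,
              lam: float = 0.1) -> torch.Tensor:
    """Sketch of the ORPO objective: L_SFT + lam * L_OR.

    chosen_logps / rejected_logps: length-normalized log-likelihoods
    log P(y|x) of the preferred (e.g. monolingual) and dispreferred
    (e.g. language-confused) responses under the current policy.
    sft_loss: standard next-token cross-entropy on the chosen response.
    lam: penalty weight (hypothetical default, not from the paper).
    """
    # log odds(y|x) = log P - log(1 - P) = logp - log1p(-exp(logp))
    log_odds_chosen = chosen_logps - torch.log1p(-torch.exp(chosen_logps))
    log_odds_rejected = rejected_logps - torch.log1p(-torch.exp(rejected_logps))
    # Odds-ratio penalty: push the chosen response's odds above the rejected one's.
    l_or = -F.logsigmoid(log_odds_chosen - log_odds_rejected)
    return sft_loss + lam * l_or.mean()

# Toy usage with per-sequence mean log-probabilities in (-inf, 0):
chosen = torch.tensor([-0.4, -0.6])
rejected = torch.tensor([-1.2, -0.9])
loss = orpo_loss(chosen, rejected, sft_loss=torch.tensor(0.5))

Because the penalty only compares relative odds of the two responses, it can be added to any SFT pipeline without a separate reference model, which is one reason it suits low-resource, limited-data settings.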
@article{lee2025_2505.19116,
  title={Controlling Language Confusion in Multilingual LLMs},
  author={Nahyun Lee and Yeongseo Woo and Hyunwoo Ko and Guijin Son},
  journal={arXiv preprint arXiv:2505.19116},
  year={2025}
}