FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation

In this paper, we present FuxiMT, a novel Chinese-centric multilingual machine translation model powered by a sparsified large language model (LLM). We adopt a two-stage strategy to train FuxiMT: we first pre-train the model on a massive Chinese corpus and then conduct multilingual fine-tuning on a large parallel dataset covering 65 languages. FuxiMT incorporates a Mixture-of-Experts (MoE) architecture and employs a curriculum learning strategy to achieve robust performance across different resource levels. Experimental results demonstrate that FuxiMT significantly outperforms strong baselines, including state-of-the-art LLMs and machine translation models, particularly in low-resource scenarios. Furthermore, FuxiMT exhibits remarkable zero-shot translation capabilities for unseen language pairs, indicating its potential to bridge communication gaps where parallel data are scarce or unavailable.
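The abstract describes sparsifying the LLM with a Mixture-of-Experts architecture but does not detail the routing or expert design. Below is a minimal sketch of a standard top-k gated MoE feed-forward block, included only to illustrate the general mechanism; all names and hyperparameters (SparseMoELayer, num_experts, top_k, d_model, d_ff) are illustrative assumptions and not taken from the paper.

```python
# Minimal top-k gated MoE feed-forward block (illustrative sketch, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router producing per-expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens so routing is done per token
        tokens = x.reshape(-1, x.size(-1))
        scores = self.gate(tokens)                       # (n_tokens, num_experts)
        top_vals, top_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)            # renormalise over the selected experts
        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            idx = top_idx[:, slot]
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():                           # only routed tokens reach expert e
                    out[mask] += weights[mask, slot, None] * expert(tokens[mask])
        return out.reshape_as(x)
```

Because only top_k experts are active per token, the layer keeps per-token compute roughly constant while the total parameter count grows with the number of experts, which is the usual motivation for MoE-based sparsification.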
@article{zhu2025_2505.14256,
  title   = {FuxiMT: Sparsifying Large Language Models for Chinese-Centric Multilingual Machine Translation},
  author  = {Shaolin Zhu and Tianyu Dong and Bo Li and Deyi Xiong},
  journal = {arXiv preprint arXiv:2505.14256},
  year    = {2025}
}