Large language models (LLMs) show promising performance on a variety of downstream tasks, such as machine translation (MT). However, using LLMs for translation incurs high computational costs and significant latency. Based on our evaluation, in most cases translations produced by LLMs are comparable to those generated by neural machine translation (NMT) systems; only in particular scenarios do LLMs and NMT models show their respective advantages. As a result, integrating NMT and LLMs for translation, and invoking the LLM only when necessary, appears to be a sound solution. This requires a scheduling policy that optimizes translation quality while ensuring fast speed and minimal LLM usage. We compare several scheduling policies and propose a novel and straightforward decider that leverages source-sentence features. We conduct extensive experiments on multilingual test sets, and the results show that we can achieve optimal translation performance with minimal LLM usage, demonstrating the effectiveness of our decider.
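The abstract does not detail the decider itself, so the following Python sketch only illustrates one way such feature-based routing between NMT and LLM backends could look. The functions `extract_features`, `needs_llm`, and `hybrid_translate`, the specific features, and the threshold values are all hypothetical placeholders, not the paper's actual method.

```python
def extract_features(source: str) -> dict:
    """Toy source-sentence features. The paper's actual feature set is not
    specified in the abstract; these are illustrative placeholders."""
    tokens = source.split()
    return {
        "length": len(tokens),
        "avg_token_len": sum(map(len, tokens)) / max(len(tokens), 1),
        "digit_ratio": sum(ch.isdigit() for ch in source) / max(len(source), 1),
    }


def needs_llm(features: dict, length_threshold: int = 40) -> bool:
    """Hypothetical decision rule: escalate long or digit-heavy sentences,
    which an NMT system might handle less reliably, to the LLM."""
    return features["length"] > length_threshold or features["digit_ratio"] > 0.2


def hybrid_translate(source: str, nmt_translate, llm_translate) -> str:
    """Route a source sentence to the fast NMT system by default, invoking
    the costlier LLM only when the decider flags the sentence."""
    if needs_llm(extract_features(source)):
        return llm_translate(source)
    return nmt_translate(source)


if __name__ == "__main__":
    # Stub backends standing in for real NMT and LLM translation services.
    nmt = lambda s: f"[NMT] {s}"
    llm = lambda s: f"[LLM] {s}"
    print(hybrid_translate("A short sentence.", nmt, llm))  # routed to NMT
```

Any decision rule with the same signature, such as a lightweight classifier trained on source-sentence features, could replace the hand-set thresholds in `needs_llm`; the key design point is that the decision must be cheap relative to an LLM call.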
@article{wu2025_2505.13554,
  title={Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation},
  author={Zhanglin Wu and Daimeng Wei and Xiaoyu Chen and Hengchao Shang and Jiaxin Guo and Zongyao Li and Yuanchang Luo and Jinlong Yang and Zhiqiang Rao and Hao Yang},
  journal={arXiv preprint arXiv:2505.13554},
  year={2025}
}