Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs

Main: 5 pages · 21 figures · Bibliography: 1 page · 5 tables · Appendix: 24 pages
Abstract

Fine-tuning pretrained LLMs has been shown to be an effective strategy for reaching state-of-the-art performance on specific tasks like machine translation. However, this adaptation often comes at the expense of general-purpose capabilities, such as conversational reasoning and instruction-following, hampering the utility of the system in real-world applications that require a mixture of skills. In this paper, we introduce Tower+, a suite of models designed to deliver strong performance across both translation and multilingual general-purpose text capabilities. We achieve a Pareto frontier between translation specialization and multilingual general-purpose capabilities by introducing a novel training recipe that builds on Tower (Alves et al., 2024), comprising continued pretraining, supervised fine-tuning, preference optimization, and reinforcement learning with verifiable rewards. At each stage of training, we carefully generate and curate data to strengthen performance on translation as well as general-purpose tasks involving code generation, mathematical problem solving, and general instruction-following. We develop models at multiple scales: 2B, 9B, and 72B. Our smaller models often outperform larger general-purpose open-weight and proprietary LLMs (e.g., Llama 3.3 70B, GPT-4o). Our largest model delivers best-in-class translation performance for high-resource languages and top results in multilingual Arena Hard evaluations and in IF-MT, a benchmark we introduce for evaluating both translation and instruction-following. Our findings highlight that it is possible to rival frontier models in general capabilities while optimizing for specific business domains, such as translation and localization.
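
The recipe is a four-stage post-training pipeline: continued pretraining, supervised fine-tuning, preference optimization, and reinforcement learning with verifiable rewards (RLVR). As a rough illustration of how these stages compose, below is a minimal Python sketch; every name in it (Stage, run_pipeline, the stage functions, and the data mixtures) is a hypothetical placeholder, not the authors' implementation.

# Hypothetical sketch of the four-stage Tower+ post-training recipe named in
# the abstract. Stage functions are stubs; real training is far more involved.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Stage:
    name: str
    data_sources: List[str]       # curated mix of translation + general data
    train: Callable[..., None]    # placeholder training step for this stage

def continued_pretrain(**kw): ...     # adapt the base LLM to multilingual text
def supervised_finetune(**kw): ...    # instructions: translation, code, math
def preference_optimize(**kw): ...    # learn from preference pairs over outputs
def rl_verifiable_rewards(**kw): ...  # RL on tasks with checkable answers

PIPELINE = [
    Stage("CPT", ["multilingual web text", "parallel data"], continued_pretrain),
    Stage("SFT", ["translation", "code", "math", "instructions"], supervised_finetune),
    Stage("PO", ["preference pairs"], preference_optimize),
    Stage("RLVR", ["verifiable-reward tasks"], rl_verifiable_rewards),
]

def run_pipeline(checkpoint: str) -> str:
    """Thread a base checkpoint through each stage in order."""
    for stage in PIPELINE:
        stage.train(checkpoint=checkpoint, data=stage.data_sources)
        checkpoint = f"{checkpoint}+{stage.name}"
    return checkpoint

Calling run_pipeline on a base checkpoint applies the four stages in the order the abstract describes; the data lists stand in for the carefully curated mixtures used to balance translation quality against general-purpose skills.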

@article{rei2025_2506.17080,
  title={Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs},
  author={Ricardo Rei and Nuno M. Guerreiro and José Pombal and João Alves and Pedro Teixeirinha and Amin Farajian and André F. T. Martins},
  journal={arXiv preprint arXiv:2506.17080},
  year={2025}
}