Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs
Ricardo Rei, Nuno M. Guerreiro, José Pombal, João Alves, Pedro Teixeirinha, Amin Farajian, André F. T. Martins
2025-07-01
Summary
This paper introduces Tower+, a family of language models built to handle many languages well, with a focus on translation alongside general-purpose text tasks. Tower+ uses a new training recipe that improves both general language ability and translation quality at the same time.
What's the problem?
Language models typically trade generality for specialization: fine-tuning a model heavily for translation tends to erode its performance on general language tasks, and vice versa. It is hard to find a single model that does both really well.
What's the solution?
The researchers created Tower+, which is trained in several stages: continued pretraining on multilingual data, supervised fine-tuning, preference optimization, and reinforcement learning. This staged recipe lets the models improve at both general language tasks and translation without sacrificing one for the other.
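The staged recipe above can be sketched as a simple pipeline in which each stage transforms the model produced by the previous one. This is only an illustrative outline, not the authors' implementation; the stage functions and the toy `model` state are hypothetical placeholders.

```python
# Illustrative sketch of a multi-stage post-training pipeline like Tower+'s.
# The stage functions and the toy "model" dictionary are hypothetical
# placeholders, not the authors' actual code.

def continued_pretraining(model):
    # Stage 1: keep pretraining on multilingual and parallel text
    # to broaden language coverage.
    return {**model, "multilingual_coverage": True}

def supervised_finetuning(model):
    # Stage 2: fine-tune on curated instruction and translation examples.
    return {**model, "instruction_following": True}

def preference_optimization(model):
    # Stage 3: align outputs with user preferences.
    return {**model, "preference_aligned": True}

def reinforcement_learning(model):
    # Stage 4: further refine the model with reward signals.
    return {**model, "rl_refined": True}

def train_tower_plus(base_model):
    # Apply the four stages in order, each building on the last.
    stages = [
        continued_pretraining,
        supervised_finetuning,
        preference_optimization,
        reinforcement_learning,
    ]
    model = base_model
    for stage in stages:
        model = stage(model)
    return model

if __name__ == "__main__":
    final_model = train_tower_plus({"name": "base-llm"})
    print(final_model)
```

The key design point the sketch captures is ordering: broad capability is built first (pretraining, fine-tuning), and alignment-style stages (preference optimization, reinforcement learning) come last so they refine rather than replace earlier skills.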
Why does it matter?
A language model that is strong across many languages and also good at translation makes communication easier across cultures, giving users more accurate results no matter which language they work in.
Abstract
Tower+, a suite of fine-tuned language models, achieves strong performance in both translation and multilingual general-purpose text tasks through a novel training recipe that includes continued pretraining, supervised fine-tuning, preference optimization, and reinforcement learning.