Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates
Atsuki Yamaguchi, Terufumi Morishita, Aline Villavicencio, Nikolaos Aletras
2025-12-05
Summary
This paper focuses on making instruction-following large language models (LLMs) work well across many different languages, including ones where there isn't much training data available.
What's the problem?
Large language models are usually trained on huge amounts of data in a few popular languages like English. Getting them to work well in another language is expensive because you typically need a lot of labeled data in that language. Worse, when you teach the model a new language, it often 'forgets' much of what it could already do in the original language, a problem called 'catastrophic forgetting'.
What's the solution?
The researchers came up with a method called Source-Shielded Updates (SSU). First, using a small amount of data in the original (source) language, they figure out which parts of the model are most important for its existing abilities. Then, when they teach the model a new language using only unlabeled data in that language, they 'freeze' those important parts so the model can't overwrite them. Concretely, they choose which parameters are allowed to update during learning, column by column within each weight matrix, protecting the core knowledge.
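The freezing idea above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it assumes a gradient-magnitude importance score computed on source-language batches (the paper's exact scoring method may differ), and the function names and the toy numbers are invented for the example.

```python
# Toy sketch of Source-Shielded Updates (SSU): score each column of a
# weight matrix on source-language data, freeze the most important
# columns, then apply masked updates during target-language training.
# Assumption: importance = accumulated absolute gradient per column.

def column_importance(grads):
    """Sum of absolute gradients per column over source-language batches.

    grads: list of gradient matrices (each a list of rows).
    Returns one importance score per column.
    """
    n_cols = len(grads[0][0])
    scores = [0.0] * n_cols
    for g in grads:
        for row in g:
            for j, v in enumerate(row):
                scores[j] += abs(v)
    return scores

def freeze_mask(scores, freeze_ratio):
    """Mask: 0.0 for the top `freeze_ratio` fraction of columns, else 1.0."""
    n_freeze = int(len(scores) * freeze_ratio)
    frozen = set(sorted(range(len(scores)), key=lambda j: -scores[j])[:n_freeze])
    return [0.0 if j in frozen else 1.0 for j in range(len(scores))]

def shielded_update(weights, grad, mask, lr):
    """SGD step that skips frozen columns: W <- W - lr * (grad * mask)."""
    return [[w - lr * g * m for w, g, m in zip(w_row, g_row, mask)]
            for w_row, g_row in zip(weights, grad)]

# Toy numbers: column 1 gets the largest source-language gradients,
# so it is identified as important and frozen before adaptation.
source_grads = [[[0.1, 2.0, 0.2], [0.3, 1.5, 0.1]]]
mask = freeze_mask(column_importance(source_grads), freeze_ratio=0.34)

# A target-language update then leaves the frozen column untouched.
W = [[1.0, 1.0, 1.0]]
W_new = shielded_update(W, [[0.5, 0.5, 0.5]], mask, lr=0.1)
```

In a real setup the mask would be applied inside the optimizer (e.g. by zeroing gradients of frozen columns), but the effect is the same: source-critical columns never move during target-language fine-tuning.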
Why does it matter?
This research is important because it makes it more practical to create language models that can be used globally. It means we don't need massive amounts of expensive labeled data for every language, and it helps ensure that the model doesn't lose its ability to understand the languages it already knows. This opens the door to more accessible and inclusive AI technology for people all over the world.
Abstract
Expanding the linguistic diversity of instruct large language models (LLMs) is crucial for global accessibility but is often hindered by the reliance on costly specialized target language labeled data and catastrophic forgetting during adaptation. We tackle this challenge under a realistic, low-resource constraint: adapting instruct LLMs using only unlabeled target language data. We introduce Source-Shielded Updates (SSU), a selective parameter update strategy that proactively preserves source knowledge. Using a small set of source data and a parameter importance scoring method, SSU identifies parameters critical to maintaining source abilities. It then applies a column-wise freezing strategy to protect these parameters before adaptation. Experiments across five typologically diverse languages and 7B and 13B models demonstrate that SSU successfully mitigates catastrophic forgetting. It reduces performance degradation on monolingual source tasks to just 3.4% (7B) and 2.8% (13B) on average, a stark contrast to the 20.3% and 22.3% from full fine-tuning. SSU also achieves target-language performance highly competitive with full fine-tuning, outperforming it on all benchmarks for 7B models and the majority for 13B models.