Control LLM: Controlled Evolution for Intelligence Retention in LLM

Haichao Wei, Yunxiang Ren, Zhoutong Fu, Aman Lunia, Yi-Lin Chen, Alice Leung, Ya Xu

2025-01-24

Summary

This paper introduces Control LLM, a new method for making large language models (LLMs) smarter without completely retraining them. It's like teaching an already smart AI new tricks without making it forget what it already knows.

What's the problem?

When we try to teach LLMs new things, they often forget what they've already learned. This is called catastrophic forgetting. It's like a student learning new math and, in the process, forgetting how to read. This problem makes it hard to improve LLMs without starting over from scratch, which takes a lot of time and computing power.

What's the solution?

The researchers created Control LLM, which uses a clever trick. It's like giving the AI two brains: one that keeps the old knowledge and one that learns new things. These two 'brains' work together, blending their internal representations so the AI can pick up new skills while keeping the old ones. The researchers tested this on different types of tasks, including math problems, coding, and understanding multiple languages.
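The "two brains" idea can be sketched in a few lines of code. This is only an illustrative toy, assuming a simple linear interpolation of each layer's hidden states; the paper's actual block structure and interpolation strategies may differ, and all names below are made up for the example.

```python
def frozen_block(hidden):
    """Stand-in for a frozen pre-trained transformer block (old knowledge)."""
    return [h * 1.0 for h in hidden]  # identity-like placeholder

def expanded_block(hidden):
    """Stand-in for a newly added, trainable block (new knowledge)."""
    return [h + 0.5 for h in hidden]  # placeholder transformation

def control_layer(hidden, alpha=0.3):
    """Interpolate the hidden states from both branches.

    alpha controls how much the new branch contributes; with alpha=0 the
    output is exactly the original branch, which is why old capabilities
    can be preserved while the new branch is trained.
    """
    old = frozen_block(hidden)
    new = expanded_block(hidden)
    return [(1 - alpha) * o + alpha * n for o, n in zip(old, new)]

print(control_layer([1.0, 2.0], alpha=0.0))  # -> [1.0, 2.0], pure old branch
print(control_layer([1.0, 2.0], alpha=0.5))  # -> [1.25, 2.25], a 50/50 blend
```

The key design point the sketch captures is that the pre-trained branch is never overwritten: new knowledge enters only through the interpolation, so turning the new branch's contribution down recovers the original model.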

Why it matters?

This matters because it could make AI systems much more flexible and efficient. Instead of having to create entirely new AIs every time we want them to learn something new, we could just update them. This could lead to smarter AI assistants that can handle a wider range of tasks without losing their existing skills. It's also important for companies using AI, as it could save them time and money while making their AI products more capable.

Abstract

Large Language Models (LLMs) demand significant computational resources, making it essential to enhance their capabilities without retraining from scratch. A key challenge in this domain is catastrophic forgetting (CF), which hampers performance during Continuous Pre-training (CPT) and Continuous Supervised Fine-Tuning (CSFT). We propose Control LLM, a novel approach that leverages parallel pre-trained and expanded transformer blocks, aligning their hidden-states through interpolation strategies. This method effectively preserves performance on existing tasks while seamlessly integrating new knowledge. Extensive experiments demonstrate the effectiveness of Control LLM in both CPT and CSFT. On Llama3.1-8B-Instruct, it achieves significant improvements in mathematical reasoning (+14.4% on Math-Hard) and coding performance (+10% on MBPP-PLUS). On Llama3.1-8B, it enhances multilingual capabilities (+10.6% on C-Eval, +6.8% on CMMLU, and +30.2% on CMMLU-0shot-CoT). It surpasses existing methods and achieves SOTA among open-source models tuned from the same base model, using substantially less data and compute. Crucially, these gains are realized while preserving strong original capabilities, with minimal degradation (<4.3% on MMLU) compared to >35% in open-source Math and Coding models. This approach has been successfully deployed in LinkedIn's GenAI-powered job seeker and Ads unit products. To support further research, we release the training and evaluation code (https://github.com/linkedin/ControlLLM) along with models trained on public datasets (https://huggingface.co/ControlLLM) to the community.