Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training
Kailai Yang, Xiao Liu, Lei Ji, Hao Li, Yeyun Gong, Peng Cheng, Mao Yang
2025-07-22
Summary
This paper introduces Data Mixing Agent, a framework that uses reinforcement learning to dynamically re-weight different domains of training data during continual pre-training of large language models.
What's the problem?
When a large language model is continually pre-trained on a new target domain, it tends to lose performance on the general source data it was originally trained on. Balancing the mixture of source and target data by hand is difficult, and a poor mixture hurts performance in one area or the other.
What's the solution?
The authors designed Data Mixing Agent to learn how to re-weight training data from different sources dynamically during training, so the model improves on target tasks such as math reasoning while retaining its general language abilities.
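The paper's agent is a learned model trained with reinforcement learning; as a minimal illustrative sketch of the underlying idea (shifting sampling probability toward domains that yield better held-out feedback), here is a simple multiplicative-weights update. The function names, the two-domain setup, and the use of multiplicative weights are illustrative assumptions, not the paper's actual method.

```python
import math

def reweight(weights, rewards, lr=0.5):
    """One re-weighting step (illustrative, not the paper's algorithm):
    each domain's sampling weight is scaled by exp(lr * reward), so
    domains whose held-out signal improved gain probability mass.
    Weights are renormalized to sum to 1."""
    scaled = [w * math.exp(lr * r) for w, r in zip(weights, rewards)]
    total = sum(scaled)
    return [w / total for w in scaled]

# Toy usage: two domains, "general" and "math" (hypothetical labels).
# Suppose a held-out evaluation suggests the math domain is currently
# more useful (higher reward), so its weight should increase.
weights = [0.5, 0.5]            # initial uniform mixture
rewards = [0.1, 0.4]            # hypothetical per-domain feedback
weights = reweight(weights, rewards)
```

After the update, the mixture still sums to 1 but is tilted toward the math domain; repeating this step as training progresses yields a dynamic, feedback-driven data schedule rather than a fixed mixture.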
Why does it matter?
This matters because it lets models gain new target-domain abilities more efficiently, without requiring excessive training data and without sacrificing their general knowledge.
Abstract
Data Mixing Agent, a model-based framework using reinforcement learning, effectively re-weights training data to balance performance across source and target fields in continual pre-training of large language models.