
Scaling Agents via Continual Pre-training

Liangcai Su, Zhen Zhang, Guangyu Li, Zhuo Chen, Chenxi Wang, Maojia Song, Xinyu Wang, Kuan Li, Jialong Wu, Xuanzhong Chen, Zile Qiao, Zhongwang Zhang, Huifeng Yin, Shihao Cai, Runnan Fang, Zhengwei Tao, Wenbiao Yin, Chenxiong Qian, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou

2025-09-17

Summary

This paper focuses on improving how well large language models can act as 'agents', meaning systems that independently use tools and reason through complex, multi-step problems to achieve goals. The authors introduce a new training stage that makes models better at these agentic tasks.

What's the problem?

Currently, when researchers try to turn existing, general-purpose language models into agents through post-training, the results fall short of what the models could achieve, especially with open-source models. The issue is that during post-training these models are asked to learn *how* to be an agent (for example, how to use tools) *and* to align with expert demonstrations at the same time. These conflicting optimization demands make it hard for them to excel at either.

What's the solution?

The researchers propose a new training method called Agentic Continual Pre-training, or Agentic CPT. The idea is to give the model a dedicated pre-training stage on agentic data *before* teaching it specific tasks, so agentic skills are built into the foundation rather than crammed into fine-tuning. Using this method, they create a model called AgentFounder-30B and show that it performs exceptionally well across ten agentic benchmarks.
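The key idea is an ordering of training stages: agentic continual pre-training first, task-specific post-training second, so each stage has a single objective. A minimal sketch of that pipeline is below; every name here (functions, data labels) is illustrative, not the authors' actual code or API:

```python
# Illustrative sketch of the two-stage pipeline described above.
# Stage 1 (Agentic CPT): continually pre-train a base model on agentic
# data (tool-use traces, multi-step reasoning) so it acquires agentic
# behaviors first. Stage 2 (post-training): align that agentic
# foundation to expert demonstrations, without also having to learn
# agentic behaviors from scratch. All names are hypothetical.

def agentic_cpt(base_model: str, agentic_corpus: str) -> str:
    """Stage 1: build an agentic foundation model via continual
    pre-training on agentic data."""
    return f"{base_model}+cpt({agentic_corpus})"

def post_train(foundation_model: str, expert_demos: str) -> str:
    """Stage 2: align the agentic foundation model to expert
    demonstrations (the usual post-training step)."""
    return f"{foundation_model}+sft({expert_demos})"

# Conventional approach: a single post-training stage must learn agentic
# behaviors AND align to demonstrations at once, creating the tension
# the paper identifies. The proposed approach splits the objectives:
foundation = agentic_cpt("base-30B", "agentic_pretraining_data")
agent = post_train(foundation, "expert_trajectories")
```

The point of the sketch is only the stage ordering: by the time alignment happens, the agentic behaviors are already in the foundation, so post-training optimizes one objective instead of two.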

Why it matters?

This work is important because it identifies a key limitation in current approaches to building AI agents and offers a solution that significantly improves performance. By creating a strong 'agentic foundation,' the model is better equipped to handle complex tasks requiring tool use and reasoning, pushing the boundaries of what AI agents can achieve.

Abstract

Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning for complex problem-solving. However, post-training approaches building upon general-purpose foundation models consistently underperform in agentic tasks, particularly in open-source implementations. We identify the root cause: the absence of robust agentic foundation models forces models during post-training to simultaneously learn diverse agentic behaviors while aligning them to expert demonstrations, thereby creating fundamental optimization tensions. To this end, we are the first to propose incorporating Agentic Continual Pre-training (Agentic CPT) into the deep research agents training pipeline to build powerful agentic foundation models. Based on this approach, we develop a deep research agent model named AgentFounder. We evaluate our AgentFounder-30B on 10 benchmarks and achieve state-of-the-art performance while retaining strong tool-use ability, notably 39.9% on BrowseComp-en, 43.3% on BrowseComp-zh, and 31.5% Pass@1 on HLE.