DeepAgent: A General Reasoning Agent with Scalable Toolsets
Xiaoxi Li, Wenxiang Jiao, Jiarui Jin, Guanting Dong, Jiajie Jin, Yinuo Wang, Hao Wang, Yutao Zhu, Ji-Rong Wen, Yuan Lu, Zhicheng Dou
2025-10-27
Summary
This paper introduces DeepAgent, an AI agent designed to handle complex, real-world tasks that require using a variety of tools and making decisions across long sequences of steps.
What's the problem?
Current AI agents often struggle with tasks that require external tools such as search engines or calculators, and they have trouble keeping track of everything that happened over a long series of steps. Existing systems usually follow a predefined workflow, which limits their ability to adapt and complete tasks independently. In particular, the record of past interactions (the 'context') grows too long and messy, which leads to errors and makes it hard to focus on what's important.
What's the solution?
The researchers created DeepAgent, which thinks, finds tools, and takes actions all within a single reasoning process. To deal with long interactions, they developed a 'memory folding' technique that compresses past interactions into three structured types of memory: episodic memory (specific events that happened), working memory (what is currently important), and tool memory (how to use tools), while preserving critical information. They also created a new reinforcement learning method, called ToolPO, which trains the agent on LLM-simulated tools and assigns credit specifically to the parts of its output where it calls a tool.
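To make the memory folding idea concrete, here is a minimal sketch of compressing a step-by-step history into episodic, working, and tool memories. It is an illustration only, not the authors' implementation: the `Step` and `FoldedMemory` types, the `fold` function, and the `keep_recent` parameter are all hypothetical names, and the simple rule-based compression stands in for whatever summarization the real system performs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """One interaction step (hypothetical structure, for illustration only)."""
    thought: str
    tool: Optional[str] = None  # name of the tool called, if any
    ok: bool = True             # whether the tool call succeeded


@dataclass
class FoldedMemory:
    """Structured summary of a long history, split into three memory types."""
    episodic: List[str] = field(default_factory=list)  # key events so far
    working: List[str] = field(default_factory=list)   # what matters right now
    tool: Dict[str, Dict[str, int]] = field(default_factory=dict)  # per-tool outcome counts


def fold(history: List[Step], keep_recent: int = 2) -> FoldedMemory:
    """Compress a full history into a compact structured memory.

    Assumption: a rule-based stand-in for the compression step, which in a
    real system would more likely be produced by a model.
    """
    mem = FoldedMemory()
    for step in history:
        if step.tool is None:
            continue
        status = "ok" if step.ok else "failed"
        # Episodic memory: one short record per tool call.
        mem.episodic.append(f"called {step.tool} ({status})")
        # Tool memory: how often each tool worked or failed.
        stats = mem.tool.setdefault(step.tool, {"ok": 0, "failed": 0})
        stats[status] += 1
    # Working memory: only the most recent thoughts are kept verbatim.
    mem.working = [s.thought for s in history[-keep_recent:]] if keep_recent > 0 else []
    return mem


history = [
    Step("look up the product", tool="search"),
    Step("retry with a narrower query", tool="search", ok=False),
    Step("add up the prices", tool="calculator"),
    Step("draft the final answer"),
]
memory = fold(history)
```

After folding, the agent would continue from `memory` instead of the full `history`, so the context stays short while the record of what was tried, and which tools worked, is retained.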
Why does it matter?
This work is important because it represents a step towards building more versatile and capable AI agents that can handle real-world problems more effectively. By allowing agents to think more autonomously and manage long-term interactions, DeepAgent opens the door to AI systems that can truly assist us with complex tasks in areas like shopping, planning, and interacting with the web.
Abstract
Large reasoning models have demonstrated strong problem-solving abilities, yet real-world tasks often require external tools and long-horizon interactions. Existing agent frameworks typically follow predefined workflows, which limit autonomous and global task completion. In this paper, we introduce DeepAgent, an end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution within a single, coherent reasoning process. To address the challenges of long-horizon interactions, particularly the context length explosion from multiple tool calls and the accumulation of interaction history, we introduce an autonomous memory folding mechanism that compresses past interactions into structured episodic, working, and tool memories, reducing error accumulation while preserving critical information. To teach general-purpose tool use efficiently and stably, we develop an end-to-end reinforcement learning strategy, namely ToolPO, that leverages LLM-simulated APIs and applies tool-call advantage attribution to assign fine-grained credit to the tool invocation tokens. Extensive experiments on eight benchmarks, including general tool-use tasks (ToolBench, API-Bank, TMDB, Spotify, ToolHop) and downstream applications (ALFWorld, WebShop, GAIA, HLE), demonstrate that DeepAgent consistently outperforms baselines across both labeled-tool and open-set tool retrieval scenarios. This work takes a step toward more general and capable agents for real-world applications. The code and demo are available at https://github.com/RUC-NLPIR/DeepAgent.
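One plausible reading of tool-call advantage attribution can be sketched as follows: every generated token receives a trajectory-level advantage for task success, and tokens that belong to a tool invocation receive an additional advantage for the quality of that call. This is a sketch under stated assumptions, not the ToolPO formulation itself. The additive combination, the function name `token_advantages`, and its parameters are hypothetical, and how the two advantage values are computed is left out.

```python
from typing import List


def token_advantages(
    tool_call_mask: List[bool],
    task_advantage: float,
    tool_call_advantage: float,
) -> List[float]:
    """Assign a per-token advantage for a single generated trajectory.

    Assumption: credit is additive. All tokens share `task_advantage`;
    only tokens flagged in `tool_call_mask` also get `tool_call_advantage`.
    """
    return [
        task_advantage + (tool_call_advantage if is_tool_token else 0.0)
        for is_tool_token in tool_call_mask
    ]


# Tokens 1 and 2 form the tool invocation; tokens 0 and 3 are ordinary reasoning.
mask = [False, True, True, False]
advantages = token_advantages(mask, task_advantage=0.5, tool_call_advantage=0.25)
```

The point of the mask is that a good or bad tool call changes the learning signal only for the tokens that produced it, rather than spreading that signal evenly over the whole trajectory.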