Toward Efficient Agents: Memory, Tool Learning, and Planning
Xiaofang Yang, Lijun Li, Heng Zhou, Tong Zhu, Xiaoye Qu, Yuchen Fan, Qianshan Wei, Rui Ye, Li Kang, Yiran Qin, Zhiqiang Kou, Daizong Liu, Qi Li, Ning Ding, Siheng Chen, Jing Shao
2026-01-21
Summary
This paper is a comprehensive look at how efficiently AI agents – programs powered by large language models that can take actions – operate. It's about making these agents practical for real-world use: not just improving how well they *can* perform, but also how quickly and cheaply they do it.
What's the problem?
AI agents are getting better at completing tasks, but they often consume a lot of computing resources, which makes them slow, expensive to run, and impractical for many applications. The core issue is that while researchers focus on making agents smarter, they pay far less attention to making them efficient in terms of processing time, the number of steps they take, and the amount of data they must process.
What's the solution?
The researchers reviewed a broad range of recent work on improving agent efficiency, focusing on three key areas: how agents store and retrieve information (memory), how they learn to use different tools, and how they plan their actions. They found that many seemingly different approaches share similar underlying ideas: compressing information to reduce how much data the agent must handle, designing rewards that encourage the agent to use tools sparingly, and using smarter search methods to find a good plan quickly. They also examined how to measure efficiency in two complementary ways: how well agents perform under a given cost budget, and how much it costs to reach a given level of performance.
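To make one of these shared ideas concrete – rewards that encourage an agent to use tools sparingly – here is a minimal sketch of what such a reward signal could look like. The function name, penalty weight, and shape are illustrative assumptions, not taken from the paper:

```python
def efficiency_aware_reward(task_success: bool, num_tool_calls: int,
                            tool_penalty: float = 0.05) -> float:
    """Hypothetical reward: credit task success, then subtract a small
    penalty for every tool invocation.

    An agent trained with this signal is nudged toward solving the task
    with as few tool calls as possible, since each call erodes its reward.
    """
    base = 1.0 if task_success else 0.0
    return base - tool_penalty * num_tool_calls
```

Under this sketch, an agent that succeeds after 3 tool calls earns a higher reward than one that succeeds after 10, even though both complete the task, which is exactly the effectiveness-versus-cost pressure the survey describes.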
Why it matters?
This work is important because efficient AI agents are essential for making this technology useful in everyday life. If agents are too slow or expensive, they won't be practical for things like customer service, automated research, or controlling robots. By identifying common strategies and outlining key challenges, this paper helps guide future research towards building AI agents that are both powerful *and* affordable.
Abstract
Recent years have witnessed increasing interest in extending large language models into agentic systems. While the effectiveness of agents has continued to improve, efficiency, which is crucial for real-world deployment, has often been overlooked. This paper therefore investigates efficiency through three core components of agents – memory, tool learning, and planning – considering costs such as latency, tokens, and steps. Aiming at a comprehensive study of the efficiency of the agentic system itself, we review a broad range of recent approaches that differ in implementation yet frequently converge on shared high-level principles, including (but not limited to) bounding context via compression and management, designing reinforcement learning rewards that minimize tool invocation, and employing controlled search mechanisms; we discuss these principles in detail. Accordingly, we characterize efficiency in two complementary ways: comparing effectiveness under a fixed cost budget, and comparing cost at a comparable level of effectiveness. This trade-off can also be viewed through the Pareto frontier between effectiveness and cost. From this perspective, we also examine efficiency-oriented benchmarks, summarizing evaluation protocols for these components and consolidating commonly reported efficiency metrics from both benchmark and methodological studies. Finally, we discuss key challenges and future directions, with the goal of providing promising insights.
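The Pareto-frontier view of the effectiveness–cost trade-off mentioned in the abstract can be sketched in a few lines: given candidate agent configurations as (cost, effectiveness) pairs, keep only those that no other configuration beats on both axes. The data points below are invented purely for illustration:

```python
def pareto_frontier(points):
    """Return the (cost, effectiveness) points not dominated by any other.

    One point dominates another if it costs no more AND is at least as
    effective, with at least one of the two comparisons strict.
    """
    frontier = []
    for cost, eff in points:
        dominated = any(
            c <= cost and e >= eff and (c < cost or e > eff)
            for c, e in points
        )
        if not dominated:
            frontier.append((cost, eff))
    return sorted(frontier)

# Hypothetical agent configurations: (token cost, task success rate)
agents = [(100, 0.60), (200, 0.80), (150, 0.55), (300, 0.82)]
print(pareto_frontier(agents))  # → [(100, 0.6), (200, 0.8), (300, 0.82)]
```

Here the (150, 0.55) configuration is dropped because the (100, 0.60) one is both cheaper and more effective; the surviving points trace the frontier along which spending more cost buys more effectiveness.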