A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

Shuzheng Si, Haozhe Zhao, Kangyang Luo, Gang Chen, Fanchao Qi, Minjia Zhang, Baobao Chang, Maosong Sun

2025-10-13

Summary

This paper addresses the challenges faced by AI agents powered by large language models when tackling complex tasks that require many steps to complete.

What's the problem?

AI agents built on large language models often struggle with tasks that require many steps of planning: they tend to resort to blind trial-and-error, and sometimes they hallucinate information or take actions that don't make sense. This happens because these agents lack a way to reason through the entire task before starting, leading to inefficient or incorrect results.

What's the solution?

The researchers developed a system called EAGLET that helps these agents plan better. It works in two main steps. First, it uses a powerful language model to generate candidate global plans and filters out unreliable ones (via a "homologous consensus" filtering strategy), then fine-tunes the planner on the surviving high-quality plans as a cold start. Second, it refines the planner with rule-based reinforcement learning, rewarding the planner according to how much its plan improves the executor agent's chances of completing the task, including difficult ones. The resulting planner is plug-and-play and can be added to existing agents.
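The two training signals described above can be sketched in a few lines. Everything here is an illustrative assumption, not the authors' actual implementation: the function names, the idea of keeping a plan only when several executor rollouts following it agree on success, and the reward defined as the executor's success gain with the plan versus without it.

```python
# Hypothetical sketch of EAGLET's two training signals.
# Names and simplified logic are illustrative, not the paper's code.

def homologous_consensus_filter(plans, rollout_outcomes, min_successes=2):
    """Cold-start data filtering (assumed form of 'homologous consensus'):
    keep a candidate plan only if enough executor rollouts that follow it
    succeed, so the fine-tuning data contains reliable plans."""
    kept = []
    for plan, outcomes in zip(plans, rollout_outcomes):
        # outcomes: list of booleans, one per executor rollout of this plan
        if sum(outcomes) >= min_successes:
            kept.append(plan)
    return kept


def capability_gain_reward(success_rate_with_plan, success_rate_without_plan):
    """Assumed form of the 'executor capability gain' reward for the
    rule-based RL stage: reward the planner by how much its plan raises
    the executor's success rate over acting with no global plan."""
    return success_rate_with_plan - success_rate_without_plan
```

For example, a plan whose rollouts succeed twice out of three would survive the filter, and a plan that lifts the executor's success rate from 0.5 to 0.75 would earn a reward of 0.25.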

Why it matters?

This work is important because it significantly improves the performance of AI agents on complex, long-horizon tasks, achieving better results than previous methods. It also makes training much more efficient, cutting training costs by about 8x compared to reinforcement-learning-based baselines, and it requires no manual effort or extra training data, making it a practical recipe for building more capable AI agents.

Abstract

Agents based on large language models (LLMs) struggle with brainless trial-and-error and generating hallucinatory actions due to a lack of global planning in long-horizon tasks. In this paper, we introduce a plan-and-execute framework and propose EAGLET, an efficient and effective planner training method to enhance the executor agent's planning abilities without human effort. Specifically, we train a plug-and-play global planner through a two-step process: we first synthesize high-quality plans from an advanced LLM using our proposed homologous consensus filtering strategy, and apply fine-tuning as a cold start. Moreover, we further improve the planner with a rule-based reinforcement learning stage using a novel executor capability gain reward, ensuring it can handle task instructions of varying difficulty. Experiments on three long-horizon agent tasks show that executor agents equipped with our planner outperform existing methods, achieving new state-of-the-art performance. Meanwhile, EAGLET reduces training costs by 8x compared to RL-based baselines, and it does not require manual effort or extra training data, offering an efficient and effective solution.