MPO: Boosting LLM Agents with Meta Plan Optimization

Weimin Xiong, Yifan Song, Qingxiu Dong, Bingchan Zhao, Feifan Song, Xun Wang, Sujian Li

2025-03-05

Summary

This paper introduces Meta Plan Optimization (MPO), a new way to make AI agents better at planning and completing tasks. It helps AI agents work more efficiently and handle new situations without needing to be retrained every time.

What's the problem?

Current AI agents built on large language models (LLMs) can make mistakes when planning tasks, a problem known as planning hallucination. They also need to be retrained for each new agent or task, which takes a lot of time and effort.

What's the solution?

The researchers created MPO, which gives AI agents high-level guidance called meta plans. These meta plans help agents plan better without relying on complicated, hand-crafted instructions. MPO also uses feedback on how well the agent performs its tasks to keep improving the meta plans, so the AI gets better at planning over time without needing to be completely retrained.
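The loop described above — give the agent a meta plan, observe how well it does, and keep the guidance that works best — can be sketched in a few lines. This is a toy illustration only: the function names, the scoring rule, and the candidate plans below are all hypothetical, not the paper's actual implementation.

```python
# Toy sketch of the MPO idea: a meta plan is high-level text guidance
# given to the agent; candidates are scored by task outcomes, and the
# best-scoring one is kept for future runs. All names here are illustrative.

def run_agent(task: str, meta_plan: str) -> float:
    """Stand-in for agent execution: returns a reward in [0, 1].
    This toy version simply rewards plans that mention key steps."""
    key_steps = ["search", "verify", "answer"]
    hits = sum(step in meta_plan for step in key_steps)
    return hits / len(key_steps)

def optimize_meta_plan(task: str, candidates: list[str]) -> str:
    """Keep the candidate meta plan that earns the highest reward."""
    best_plan, best_reward = candidates[0], -1.0
    for plan in candidates:
        reward = run_agent(task, plan)  # feedback from task execution
        if reward > best_reward:
            best_plan, best_reward = plan, reward
    return best_plan

plans = [
    "search for facts, then answer",
    "search for facts, verify them, then answer",
]
print(optimize_meta_plan("trivia question", plans))
# selects the plan covering more of the required steps
```

Because the meta plan is just text handed to the agent at inference time, swapping in a better plan requires no retraining — which is why the paper describes MPO as plug-and-play.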

Why it matters?

This matters because it makes AI agents more reliable and flexible. They can now handle a wider range of tasks more efficiently, even ones they haven't seen before. This could lead to AI assistants that are more helpful in various situations, from customer service to complex problem-solving, without requiring constant updates or retraining.

Abstract

Recent advancements in large language models (LLMs) have enabled LLM-based agents to successfully tackle interactive planning tasks. However, despite their successes, existing approaches often suffer from planning hallucinations and require retraining for each new agent. To address these challenges, we propose the Meta Plan Optimization (MPO) framework, which enhances agent planning capabilities by directly incorporating explicit guidance. Unlike previous methods that rely on complex knowledge, which either require significant human effort or lack quality assurance, MPO leverages high-level general guidance through meta plans to assist agent planning and enables continuous optimization of the meta plans based on feedback from the agent's task execution. Our experiments conducted on two representative tasks demonstrate that MPO significantly outperforms existing baselines. Moreover, our analysis indicates that MPO provides a plug-and-play solution that enhances both task completion efficiency and generalization capabilities in previously unseen scenarios.