WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

Genglin Liu, Shijie Geng, Sha Li, Hejie Cui, Sarah Zhang, Xin Liu, Tianyi Liu

2025-11-18

Summary

This paper introduces WebCoach, a framework that helps AI web agents improve over time. These agents, powered by large language models, complete online tasks such as booking flights or finding information, but they often repeat the same mistakes.

What's the problem?

Current web-browsing AI agents can complete many tasks, but they do not retain what they learned in earlier attempts. When an agent fails at a task, it usually starts from scratch the next time, even if it runs into the same problem. As a result, these agents are sample-inefficient and cannot reliably handle complex tasks that require learning from experience across browsing sessions.

What's the solution?

WebCoach addresses this by giving the agent a persistent memory that works in three parts. First, a WebCondenser summarizes each browsing session into a short, standardized format. Second, an External Memory Store keeps these summaries as episodic experiences. Third, a Coach retrieves past experiences that are similar and recent, and decides whether to inject task-specific advice into the agent during the current task. This lets the agent learn and improve without being retrained.
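The first two parts can be pictured with a small Python sketch. It is illustrative only and is not the paper's implementation: the names `Experience`, `condense`, and `MemoryStore` are hypothetical, and simply keeping the last few steps stands in for whatever summarization the real condenser performs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Experience:
    """One condensed browsing session (hypothetical record layout)."""
    task: str
    outcome: str  # "success" or "failure"
    summary: str


def condense(task: str, steps: List[str], succeeded: bool, max_steps: int = 3) -> Experience:
    """Reduce a raw step log to a short summary.

    Assumption: keeping only the final steps is a stand-in for real summarization.
    """
    kept = steps[-max_steps:]
    outcome = "success" if succeeded else "failure"
    summary = f"{task} ({outcome}): " + " -> ".join(kept)
    return Experience(task=task, outcome=outcome, summary=summary)


@dataclass
class MemoryStore:
    """Keeps experiences across sessions; a plain in-memory list here."""
    experiences: List[Experience] = field(default_factory=list)

    def add(self, experience: Experience) -> None:
        self.experiences.append(experience)


store = MemoryStore()
store.add(condense("book flight", ["open site", "search", "click date", "timeout"], succeeded=False))
```

A persistent version would write the store to disk or a database so that later sessions can read it; the in-memory list only keeps the sketch short.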

Why does it matter?

This research matters because it makes web-browsing AI agents more reliable and efficient. By letting agents learn from past mistakes and build on earlier successes, WebCoach improves their performance on complex tasks. On the WebVoyager benchmark, it raises the task success rate of a 38B model from 47% to 61%, and smaller models with WebCoach reach performance comparable to the same agent running on GPT-4o, which makes the technology more accessible.

Abstract

Multimodal LLM-powered agents have recently demonstrated impressive capabilities in web navigation, enabling agents to complete complex browsing tasks across diverse domains. However, current agents struggle with repetitive errors and lack the ability to learn from past experiences across sessions, limiting their long-term robustness and sample efficiency. We introduce WebCoach, a model-agnostic self-evolving framework that equips web browsing agents with persistent cross-session memory, enabling improved long-term planning, reflection, and continual learning without retraining. WebCoach consists of three key components: (1) a WebCondenser, which standardizes raw navigation logs into concise summaries; (2) an External Memory Store, which organizes complete trajectories as episodic experiences; and (3) a Coach, which retrieves relevant experiences based on similarity and recency, and decides whether to inject task-specific advice into the agent via runtime hooks. This design empowers web agents to access long-term memory beyond their native context window, improving robustness in complex browsing tasks. Moreover, WebCoach achieves self-evolution by continuously curating episodic memory from new navigation trajectories, enabling agents to improve over time without retraining. Evaluations on the WebVoyager benchmark demonstrate that WebCoach consistently improves the performance of browser-use agents across three different LLM backbones. With a 38B model, it increases task success rates from 47% to 61% while reducing or maintaining the average number of steps. Notably, smaller base models with WebCoach achieve performance comparable to the same web agent using GPT-4o.
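The Coach's retrieval step, which ranks experiences by similarity and recency and then decides whether to give advice, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: word-overlap (Jaccard) similarity, an exponential recency decay with a `half_life` parameter, and a fixed `threshold` are all choices made for this example, and the names `Episode`, `score`, and `coach` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Episode:
    """A stored experience (hypothetical layout)."""
    task: str
    advice: str
    age_in_sessions: int  # 0 means the most recent session


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; assumed here in place of a learned embedding."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def score(query: str, episode: Episode, half_life: float = 10.0) -> float:
    """Combine similarity with a recency weight that halves every `half_life` sessions."""
    recency = 0.5 ** (episode.age_in_sessions / half_life)
    return jaccard(query, episode.task) * recency


def coach(query: str, memory: List[Episode], threshold: float = 0.3) -> Optional[str]:
    """Return advice from the best-matching episode, or None to stay silent."""
    if not memory:
        return None
    best = max(memory, key=lambda episode: score(query, episode))
    return best.advice if score(query, best) >= threshold else None


memory = [
    Episode("book a flight to Paris", "Dismiss the cookie banner first", 0),
    Episode("find a pasta recipe", "Use the site search box", 0),
]
advice = coach("book a flight to Tokyo", memory)
```

Returning `None` when no episode clears the threshold mirrors the idea that the Coach decides whether to intervene at all, rather than injecting advice on every step.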
