SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills
Boyuan Zheng, Michael Y. Fatemi, Xiaolong Jin, Zora Zhiruo Wang, Apurva Gandhi, Yueqi Song, Yu Gu, Jayanth Srinivasa, Gaowen Liu, Graham Neubig, Yu Su
2025-04-09
Summary
This paper introduces SkillWeaver, a framework that helps AI web agents learn and improve by automatically creating reusable tools (APIs that work like mini-programs) as they explore new websites.
What's the problem?
Current AI web helpers struggle to learn from experience, can’t turn their actions into reusable skills, and often need to start from scratch on each new website.
What's the solution?
SkillWeaver lets AI agents explore websites, practice tasks, and package their successful actions into shareable tools (APIs) that can be reused later, building a growing library of skills.
Why does it matter?
This makes AI web assistants smarter and faster over time, helping them handle complex tasks like online shopping or form-filling more reliably, and allows weaker AI systems to borrow skills from stronger ones.
Abstract
To survive and thrive in complex environments, humans have evolved sophisticated self-improvement mechanisms through environment exploration, hierarchical abstraction of experiences into reusable skills, and collaborative construction of an ever-growing skill repertoire. Despite recent advancements, autonomous web agents still lack crucial self-improvement capabilities, struggling with procedural knowledge abstraction, refining skills, and skill composition. In this work, we introduce SkillWeaver, a skill-centric framework enabling agents to self-improve by autonomously synthesizing reusable skills as APIs. Given a new website, the agent autonomously discovers skills, executes them for practice, and distills practice experiences into robust APIs. Iterative exploration continually expands a library of lightweight, plug-and-play APIs, significantly enhancing the agent's capabilities. Experiments on WebArena and real-world websites demonstrate the efficacy of SkillWeaver, achieving relative success rate improvements of 31.8% and 39.8%, respectively. Additionally, APIs synthesized by strong agents substantially enhance weaker agents through transferable skills, yielding improvements of up to 54.3% on WebArena. These results demonstrate the effectiveness of honing diverse website interactions into APIs, which can be seamlessly shared among various web agents.
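The discover, practice, and distill loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: in SkillWeaver the candidate skills are proposed by an agent exploring a live website, whereas here they are plain Python functions over a toy in-memory catalog. The names `Skill`, `SkillLibrary`, `explore`, `CATALOG`, `search_catalog`, and `broken_add_to_cart`, as well as the `min_success_rate` threshold, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Skill:
    """A candidate skill plus its practice record (hypothetical structure)."""
    name: str
    func: Callable[..., object]
    attempts: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0


@dataclass
class SkillLibrary:
    """A growing collection of skills that passed practice."""
    skills: Dict[str, Skill] = field(default_factory=dict)

    def add(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def call(self, name: str, *args, **kwargs):
        return self.skills[name].func(*args, **kwargs)


# One practice case: (positional arguments, expected result).
Case = Tuple[tuple, object]


def explore(candidates: List[Tuple[str, Callable[..., object], List[Case]]],
            library: SkillLibrary,
            min_success_rate: float = 1.0) -> None:
    """Practice each candidate and keep only those that prove reliable."""
    for name, func, cases in candidates:
        skill = Skill(name, func)
        for args, expected in cases:
            skill.attempts += 1
            try:
                if func(*args) == expected:
                    skill.successes += 1
            except Exception:
                pass  # a crash during practice counts as a failed attempt
        if skill.attempts and skill.success_rate >= min_success_rate:
            library.add(skill)


# Toy stand-in for a website (assumption: real skills would drive a browser).
CATALOG = ["Blue Mug", "Red Kettle", "Green Mug"]


def search_catalog(query: str) -> List[str]:
    return [item for item in CATALOG if query.lower() in item.lower()]


def broken_add_to_cart(item: str) -> bool:
    raise RuntimeError("button not found")


library = SkillLibrary()
explore(
    [
        ("search_catalog", search_catalog,
         [(("mug",), ["Blue Mug", "Green Mug"]), (("kettle",), ["Red Kettle"])]),
        ("broken_add_to_cart", broken_add_to_cart, [(("Blue Mug",), True)]),
    ],
    library,
)
```

After this run, only `search_catalog` is in the library, because the second candidate failed its practice case. Since the library is an ordinary object holding callables, a different agent could reuse it via `library.call("search_catalog", "mug")`, which loosely mirrors the abstract's point that synthesized APIs can be shared between stronger and weaker agents.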