SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Peng Xia, Jianwen Chen, Hanyang Wang, Jiaqi Liu, Kaide Zeng, Yu Wang, Siwei Han, Yiyang Zhou, Xujiang Zhao, Haifeng Chen, Zeyu Zheng, Cihang Xie, Huaxiu Yao

2026-02-11

Summary

This paper introduces SkillRL, a new way to help AI agents powered by large language models learn and improve over time by remembering and reusing successful strategies.

What's the problem?

Current LLM-based agents are good at solving complex problems, but they typically start each task from scratch, forgetting what they learned before. Existing memory methods simply store everything that happened as raw trajectories, much of which is redundant or noisy, so it is hard for the agent to identify and reuse truly helpful patterns.

What's the solution?

SkillRL tackles this by automatically distilling useful 'skills' from the agent's experiences and organizing them into a 'SkillBank'. When facing a new problem, it retrieves the skills that fit the task at hand and adapts them accordingly. Importantly, the skill library isn't fixed: it keeps improving alongside the agent's problem-solving policy through a process called recursive evolution. This makes the system more efficient and better at reasoning; a rough sketch of the idea appears below.
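
The following is a minimal, hypothetical sketch of what a hierarchical skill library with adaptive retrieval could look like. All names here (Skill, SkillBank, retrieve, task_tag, successes) are illustrative assumptions, not the paper's actual implementation; they only show the shape of the idea: store distilled heuristics at a general and a task-specific level, and surface the most useful ones for the current task.

```python
# Hypothetical sketch of a hierarchical skill library with adaptive retrieval.
# Names and ranking logic are assumptions for illustration, not SkillRL's code.
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A distilled, reusable behavioral pattern extracted from past episodes."""
    description: str               # natural-language summary of the heuristic
    level: str                     # "general" or "task_specific"
    task_tag: str | None = None    # e.g. "webshop", "alfworld"; None for general skills
    successes: int = 0             # usage statistic used to rank retrieval


@dataclass
class SkillBank:
    """Hierarchical store: general heuristics plus per-task skills."""
    skills: list[Skill] = field(default_factory=list)

    def add(self, skill: Skill) -> None:
        self.skills.append(skill)

    def retrieve(self, task_tag: str, k: int = 3) -> list[Skill]:
        """Adaptive retrieval: top general skills first, then the most
        successful skills specific to the current task."""
        general = [s for s in self.skills if s.level == "general"]
        specific = [s for s in self.skills if s.task_tag == task_tag]
        ranked = sorted(general, key=lambda s: -s.successes)[:k]
        ranked += sorted(specific, key=lambda s: -s.successes)[:k]
        return ranked


# Example: skills retrieved for a web-shopping episode would be prepended to the
# agent's prompt, so it conditions on distilled heuristics rather than raw logs.
bank = SkillBank()
bank.add(Skill("Verify preconditions before acting.", level="general", successes=5))
bank.add(Skill("Filter by price before comparing options.", level="task_specific",
               task_tag="webshop", successes=3))
prompt_skills = bank.retrieve(task_tag="webshop")
```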

Why it matters?

This research matters because it lets AI agents become more effective and efficient learners. By building a reusable library of skills, SkillRL significantly boosts performance on tasks such as shopping on websites and following instructions in a simulated household environment, and it stays effective even as tasks become more challenging. This is a step towards creating AI that can genuinely learn and adapt over time, much as humans do.

Abstract

Large Language Model (LLM) agents have shown impressive results on complex tasks, yet they often operate in isolation, failing to learn from past experiences. Existing memory-based methods primarily store raw trajectories, which are often redundant and noisy. This prevents agents from extracting high-level, reusable behavioral patterns that are essential for generalization. In this paper, we propose SkillRL, a framework that bridges the gap between raw experience and policy improvement through automatic skill discovery and recursive evolution. Our approach introduces an experience-based distillation mechanism to build a hierarchical skill library (SkillBank), an adaptive retrieval strategy for general and task-specific heuristics, and a recursive evolution mechanism that allows the skill library to co-evolve with the agent's policy during reinforcement learning. These innovations significantly reduce the token footprint while enhancing reasoning utility. Experimental results on ALFWorld, WebShop, and seven search-augmented tasks demonstrate that SkillRL achieves state-of-the-art performance, outperforming strong baselines by over 15.3% and maintaining robustness as task complexity increases. Code is available at https://github.com/aiming-lab/SkillRL.
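
To make the recursive evolution idea more concrete, here is a minimal, self-contained sketch of a co-evolution loop under simplifying assumptions: collect_rollouts, distill_skills, and policy_update are placeholder functions, not the paper's training code, which would run LLM-driven episodes and a real reinforcement learning update rather than the toy stand-ins below.

```python
# Hypothetical sketch: skill library and policy are refined together each round.
# All functions are placeholders for illustration, not SkillRL's actual pipeline.
import random


def collect_rollouts(temperature: float, skills: list[str]) -> list[dict]:
    # Placeholder: a real agent would run LLM-driven episodes conditioned on
    # the retrieved skills and return (trajectory, reward) pairs.
    return [{"trajectory": f"episode-{i}", "reward": random.random(),
             "skills_used": skills} for i in range(4)]


def distill_skills(rollouts: list[dict], threshold: float = 0.7) -> list[str]:
    # Placeholder: the real system would abstract reusable heuristics from
    # high-reward trajectories; here we just tag which episodes qualify.
    return [f"heuristic distilled from {r['trajectory']}" for r in rollouts
            if r["reward"] >= threshold]


def policy_update(rollouts: list[dict], temperature: float) -> float:
    # Placeholder for the RL step (e.g., a policy-gradient update); annealing
    # the sampling temperature stands in for policy improvement here.
    return max(0.1, temperature * 0.95)


skill_bank: list[str] = []
temperature = 1.0
for iteration in range(3):
    rollouts = collect_rollouts(temperature, skill_bank)   # act with current skills
    skill_bank.extend(distill_skills(rollouts))            # grow the skill library
    temperature = policy_update(rollouts, temperature)     # improve the policy
```

The property this loop illustrates is that each iteration both updates the policy and enriches the skill library, so later rollouts are conditioned on skills distilled from earlier ones.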