Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Peng Xia, Kaide Zeng, Jiaqi Liu, Can Qin, Fang Wu, Yiyang Zhou, Caiming Xiong, Huaxiu Yao
2025-11-21
Summary
This paper introduces Agent0, a new way to train AI agents to become smarter without humans constantly supplying them with data or examples.
What's the problem?
Currently, training AI agents, especially those built on Large Language Models, relies heavily on human-created datasets. This is slow, scales poorly, and limits the AI to what humans already know. While some systems try to let AI learn on its own, they often fail to create truly challenging learning experiences or to use tools effectively, because they are restricted to single-step interactions.
What's the solution?
The researchers created Agent0, a system where two AI agents work against each other. One agent, the 'curriculum agent,' designs increasingly difficult tasks. The other agent, the 'executor agent,' tries to solve those tasks. Importantly, the executor agent can also use external tools to help it. As the executor gets better at solving problems with tools, the curriculum agent has to create even harder, more complex tasks that require clever tool use. This creates a cycle where both agents constantly improve each other, leading to a smarter overall system.
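The loop described above can be sketched as a toy simulation. This is not the paper's implementation: the agent classes, the difficulty/skill scalars, and the update thresholds are all illustrative assumptions standing in for the actual RL training of two LLM-initialized agents.

```python
import random

class CurriculumAgent:
    """Proposes tasks near the executor's current capability frontier."""
    def __init__(self):
        self.difficulty = 1.0

    def propose_task(self):
        # Sample a task slightly beyond the current frontier, so the
        # executor is challenged but not hopelessly overwhelmed.
        return self.difficulty + random.uniform(0.0, 0.5)

    def update(self, solve_rate):
        # If the executor solves most tasks, push the frontier outward;
        # if it fails most, ease off. (A stand-in for the curriculum
        # agent's actual RL reward in the paper.)
        if solve_rate > 0.7:
            self.difficulty += 0.3
        elif solve_rate < 0.3:
            self.difficulty = max(1.0, self.difficulty - 0.1)

class ExecutorAgent:
    """Attempts tasks; tool use extends its effective reach."""
    def __init__(self):
        self.skill = 1.0
        self.tool_bonus = 0.5  # extra capability gained from external tools

    def attempt(self, task_difficulty):
        solved = self.skill + self.tool_bonus >= task_difficulty
        if solved:
            self.skill += 0.05  # proxy for an RL update on solved tasks
        return solved

def co_evolve(rounds=20, tasks_per_round=10):
    """Run the symbiotic competition: each agent's progress pressures the other."""
    curriculum, executor = CurriculumAgent(), ExecutorAgent()
    for _ in range(rounds):
        results = [executor.attempt(curriculum.propose_task())
                   for _ in range(tasks_per_round)]
        curriculum.update(sum(results) / tasks_per_round)
    return curriculum.difficulty, executor.skill
```

Running the loop shows the intended dynamic: as the executor's tool-augmented skill grows, the curriculum's task difficulty is pushed upward in lockstep, with neither value fixed in advance.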
Why it matters?
This research is important because it shows a path towards creating AI agents that can learn and improve on their own, without constant human intervention. By allowing the AI to generate its own learning experiences and utilize tools, Agent0 significantly boosts reasoning abilities, making the AI more capable and adaptable. This could lead to more powerful and versatile AI systems in the future.
Abstract
Large Language Model (LLM) Agents, often trained with Reinforcement Learning (RL), are constrained by a dependency on human-curated data, limiting scalability and tethering AI to human knowledge. Existing self-evolution frameworks offer an alternative but are typically restricted by the model's inherent capabilities and single-round interactions, hindering the development of complex curricula involving tool use or dynamic reasoning. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data through multi-step co-evolution and seamless tool integration. Agent0 establishes a symbiotic competition between two agents initialized from the same base LLM: a curriculum agent that proposes increasingly challenging frontier tasks, and an executor agent that learns to solve them. We integrate external tools to enhance the executor's problem-solving capacity; this improvement, in turn, pressures the curriculum agent to construct more complex, tool-aware tasks. Through this iterative process, Agent0 establishes a self-reinforcing cycle that continuously produces high-quality curricula. Empirically, Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks. Code is available at https://github.com/aiming-lab/Agent0.
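The "seamless tool integration" in the abstract means the executor can interleave natural-language reasoning with tool calls whose results are fed back into the trace. A minimal sketch of that pattern is below; the `<tool>...</tool>` tag format and the `run_python_tool` helper are assumptions for illustration, not the paper's actual interface, and a real system would sandbox the execution.

```python
import re

def run_python_tool(code):
    """Hypothetical tool backend: execute a snippet, return its 'result' variable."""
    scope = {}
    exec(code, scope)  # illustration only; a real agent would sandbox this
    return scope.get("result")

def solve_with_tools(model_output):
    """Replace each <tool>...</tool> span in a reasoning trace with the tool's result."""
    def substitute(match):
        return str(run_python_tool(match.group(1)))
    return re.sub(r"<tool>(.*?)</tool>", substitute, model_output, flags=re.S)

# Example: the model offloads exact arithmetic to the tool mid-reasoning.
trace = "The sum is <tool>result = sum(range(1, 101))</tool>, so the answer is 5050."
print(solve_with_tools(trace))  # The sum is 5050, so the answer is 5050.
```

The design point this illustrates is the pressure loop from the abstract: once the executor can delegate subproblems to tools, purely self-contained tasks stop being challenging, so the curriculum agent must generate tasks that specifically demand multi-step, tool-aware reasoning.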