AWorld: Orchestrating the Training Recipe for Agentic AI
Chengyue Yu, Siyuan Lu, Chenyi Zhuang, Dong Wang, Qintong Wu, Zongyue Li, Runsheng Gan, Chunfeng Wang, Siqi Hou, Gaochi Huang, Wenlong Yan, Lifeng Hong, Aohui Xue, Yanfeng Wang, Jinjie Gu, David Tsai, Tao Lin
2025-08-29
Summary
This paper focuses on improving how AI agents learn by doing, specifically by making the process of gathering experience much faster and more efficient.
What's the problem?
Creating truly intelligent AI agents that can act and learn in complex environments is difficult because it takes a huge amount of time for them to practice and gain experience. This is a major roadblock, especially when dealing with complicated tasks and environments like the GAIA benchmark, where simply getting enough practice data is a challenge.
What's the solution?
The researchers developed a system called AWorld that allows for massively parallel experience gathering. Imagine many AI agents trying things out simultaneously across a network of computers instead of a single agent working through tasks one at a time. This speeds up experience collection dramatically, in this case by a factor of 14.6. They then used AWorld to train a powerful AI model (Qwen3-32B) and showed that it could learn to perform much better on the GAIA benchmark, raising its accuracy from 21.59% to 32.23%.
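The core idea of fanning rollouts out across many workers can be sketched in a few lines of Python. This is a minimal illustration only, assuming I/O-bound rollouts (agent and tool calls waiting on responses); `run_episode`, `collect_experience`, and the dummy trajectory are hypothetical names, not the AWorld API:

```python
from concurrent.futures import ThreadPoolExecutor

def run_episode(task_id):
    # Placeholder rollout: in a real system this would drive an agent
    # through an environment and record (observation, action, reward)
    # steps until the episode ends.
    trajectory = [("obs", "action", 0.0) for _ in range(3)]
    return task_id, trajectory

def collect_experience(task_ids, max_workers=8):
    # Fan tasks out across a pool of workers so many episodes run
    # concurrently; the same pattern scales from one machine to a
    # cluster of nodes.
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for task_id, trajectory in pool.map(run_episode, task_ids):
            results[task_id] = trajectory
    return results

experience = collect_experience(list(range(16)), max_workers=4)
```

With episodes dominated by waiting on model and tool responses, wall-clock time shrinks roughly with the number of concurrent workers, which is the effect behind the reported 14.6x speedup over sequential, single-node execution.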
Why it matters?
This work is important because it provides a practical way to train more capable AI agents. By making experience gathering faster and more scalable, it opens the door to building AI systems that can tackle more complex real-world problems. Furthermore, they’ve released their system as open-source, meaning other researchers and developers can build upon their work and accelerate progress in the field of agentic AI.
Abstract
The learning from practice paradigm is crucial for developing capable Agentic AI systems, yet it is severely hampered by inefficient experience generation, a bottleneck especially pronounced in complex benchmarks like GAIA. To address this, we introduce AWorld, an open-source system engineered for large-scale agent-environment interaction. By distributing tasks across a cluster, AWorld accelerates experience collection by 14.6x compared to standard single-node, sequential execution. This critical speedup makes extensive reinforcement learning practical and scalable. Leveraging this capability, we trained a Qwen3-32B-based agent that significantly outperforms its base model, increasing its overall GAIA accuracy from 21.59% to 32.23%. On the benchmark's most challenging levels, our agent achieves a score of 16.33%, surpassing the performance of leading proprietary models. Our open-source system and resulting agent provide a practical blueprint for a complete agentic AI training pipeline, from efficient interaction to demonstrable model improvement.