Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
Yuchen Shi, Yuzheng Cai, Siqi Cai, Zihan Xu, Lichao Chen, Yulei Qin, Zhijian Zhou, Xiang Fei, Chaofan Qiu, Xiaoyu Tan, Gang Li, Zongyi Li, Haojia Lin, Guocan Cai, Yong Mao, Yunsheng Wu, Ke Li, Xing Sun
2026-01-05
Summary
This paper introduces Youtu-Agent, a new system for building and improving AI agents powered by large language models. It aims to make creating these agents easier and more adaptable to changing situations.
What's the problem?
Currently, building good AI agents is hard work. It requires a lot of manual setup to connect the agent to different tools and carefully craft the instructions it follows. Once deployed, these agents aren't great at handling new or unexpected situations without being completely retrained, which is expensive and time-consuming.
What's the solution?
Youtu-Agent addresses this with a flexible, modular framework. It decouples how the agent executes from its toolkits and its context management, so components can be reused and recombined freely. It can generate agents automatically in two ways: a Workflow mode for standard tasks and a Meta-Agent mode for complex, non-standard requirements, even writing the necessary tool code, prompts, and configurations itself. Crucially, it also offers two routes to improvement: 'Agent Practice' lets agents accumulate experience and improve in context without updating model parameters, while 'Agent RL' plugs into distributed training frameworks for large-scale reinforcement learning.
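The decoupling idea can be sketched as plain data objects that are assembled into agents: an execution environment, a set of reusable toolkits, and a prompt are composed rather than hand-wired each time. All class, field, and function names below are illustrative assumptions, not Youtu-Agent's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class ToolkitConfig:
    """A reusable bundle of tools (names only, for illustration)."""
    name: str
    tools: list

@dataclass
class AgentConfig:
    """An agent is just a composition of decoupled parts."""
    environment: str
    toolkits: list = field(default_factory=list)
    context: dict = field(default_factory=dict)
    prompt: str = ""

def compose_agent(env, toolkits, prompt):
    """Assemble an agent config from reusable parts, no manual rewiring."""
    return AgentConfig(environment=env, toolkits=list(toolkits), prompt=prompt)

# The same toolkit objects can back many different agents.
search_tools = ToolkitConfig("web", ["search", "fetch_page"])
code_tools = ToolkitConfig("python", ["run_code"])

researcher = compose_agent("sandbox", [search_tools], "Answer questions by browsing.")
analyst = compose_agent("sandbox", [search_tools, code_tools], "Analyze data with code.")

print(len(analyst.toolkits))  # → 2
```

An automated generator (Workflow or Meta-Agent mode) would then emit configurations like these, plus the tool code and prompts they reference, instead of a human writing them by hand.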
Why it matters?
This work is important because it makes AI agents more accessible and effective. By automating the creation process and allowing agents to continuously learn and adapt, Youtu-Agent can lead to more powerful and reliable AI systems that handle a wider range of real-world problems. The results show significant performance gains across a range of tasks, along with faster training of the underlying language models.
Abstract
Existing Large Language Model (LLM) agent frameworks face two significant challenges: high configuration costs and static capabilities. Building a high-quality agent often requires extensive manual effort in tool integration and prompt engineering, while deployed agents struggle to adapt to dynamic environments without expensive fine-tuning. To address these issues, we propose Youtu-Agent, a modular framework designed for the automated generation and continuous evolution of LLM agents. Youtu-Agent features a structured configuration system that decouples execution environments, toolkits, and context management, enabling flexible reuse and automated synthesis. We introduce two generation paradigms: a Workflow mode for standard tasks and a Meta-Agent mode for complex, non-standard requirements, capable of automatically generating tool code, prompts, and configurations. Furthermore, Youtu-Agent establishes a hybrid policy optimization system: (1) an Agent Practice module that enables agents to accumulate experience and improve performance through in-context optimization without parameter updates; and (2) an Agent RL module that integrates with distributed training frameworks to enable scalable and stable reinforcement learning of any Youtu-Agent in an end-to-end, large-scale manner. Experiments demonstrate that Youtu-Agent achieves state-of-the-art performance on WebWalkerQA (71.47%) and GAIA (72.8%) using open-weight models. Our automated generation pipeline achieves an over 81% tool synthesis success rate, while the Practice module improves performance on AIME 2024/2025 by +2.7% and +5.4%, respectively. Moreover, our Agent RL training achieves a 40% speedup with steady performance improvement on 7B LLMs, enhancing coding/reasoning and searching capabilities by up to 35% and 21% on Maths and general/multi-hop QA benchmarks, respectively.
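The Agent Practice idea, improvement through in-context optimization without parameter updates, can be illustrated with a minimal sketch: lessons from past attempts are stored and prepended to future prompts, so behavior changes while the model's weights do not. The names below (`PracticeMemory`, `build_prompt`) are hypothetical, not the paper's implementation:

```python
class PracticeMemory:
    """Stores distilled lessons from past runs; a stand-in for the
    experience accumulation that Agent Practice describes."""

    def __init__(self, max_items=5):
        self.lessons = []
        self.max_items = max_items

    def record(self, task, outcome, lesson):
        """Keep only the most recent lessons so the context stays small."""
        self.lessons.append({"task": task, "outcome": outcome, "lesson": lesson})
        self.lessons = self.lessons[-self.max_items:]

    def render(self):
        """Format accumulated lessons as a prompt prefix."""
        if not self.lessons:
            return ""
        lines = [f"- {item['lesson']}" for item in self.lessons]
        return "Lessons from earlier attempts:\n" + "\n".join(lines)

def build_prompt(memory, task):
    """Improvement happens purely in context: the model is unchanged,
    but each new prompt carries the agent's accumulated experience."""
    experience = memory.render()
    prefix = experience + "\n\n" if experience else ""
    return prefix + f"Task: {task}"

mem = PracticeMemory()
mem.record("AIME problem 3", "wrong", "Check small cases before generalizing.")
print(build_prompt(mem, "AIME problem 7"))
```

The Agent RL module, by contrast, would update the model's parameters via a distributed training backend; the in-context path above is the cheap, fine-tuning-free complement to it.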