WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents
Zile Qiao, Guoxin Chen, Xuanzhong Chen, Donglei Yu, Wenbiao Yin, Xinyu Wang, Zhen Zhang, Baixuan Li, Huifeng Yin, Kuan Li, Rui Min, Minpeng Liao, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou
2025-09-17
Summary
This paper introduces WebResearcher, a new system that allows AI to conduct research online, gather information, and write reports, much like a human researcher would.
What's the problem?
Current AI systems struggle with complex research tasks because they get overwhelmed by too much information and distracted by irrelevant details. They typically pile everything they retrieve into one ever-growing context, which makes it hard to stay focused and build a coherent understanding over time. As a result, they cannot effectively work through a research problem step by step.
What's the solution?
WebResearcher tackles this by breaking research into a series of rounds: asking questions, finding answers, and then consolidating what has been learned into an evolving report. It models this process as a 'Markov Decision Process', a mathematical framework for step-by-step decision making, so that each round works from the current report in a focused workspace rather than from the entire history. A companion data engine, WebFrontier, generates progressively harder practice research problems so the AI gets better at using tools and building knowledge. The system can also run multiple AI agents on the same problem in parallel to reach a more comprehensive conclusion.
Why does it matter?
This work is a step toward AI systems that actively research and synthesize information rather than simply recall facts. In experiments across 6 challenging benchmarks, WebResearcher achieves state-of-the-art performance and even surpasses frontier proprietary systems. The training data it generates also improves tool use in traditional single-context agents.
Abstract
Recent advances in deep-research systems have demonstrated the potential for AI agents to autonomously discover and synthesize knowledge from external sources. In this paper, we introduce WebResearcher, a novel framework for building such agents through two key components: (1) WebResearcher, an iterative deep-research paradigm that reformulates deep research as a Markov Decision Process, where agents periodically consolidate findings into evolving reports while maintaining focused workspaces, overcoming the context suffocation and noise contamination that plague existing mono-contextual approaches; and (2) WebFrontier, a scalable data synthesis engine that generates high-quality training data through tool-augmented complexity escalation, enabling systematic creation of research tasks that bridge the gap between passive knowledge recall and active knowledge construction. Notably, we find that the training data from our paradigm significantly enhances tool-use capabilities even for traditional mono-contextual methods. Furthermore, our paradigm naturally scales through parallel thinking, enabling concurrent multi-agent exploration for more comprehensive conclusions. Extensive experiments across 6 challenging benchmarks demonstrate that WebResearcher achieves state-of-the-art performance, even surpassing frontier proprietary systems.
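The iterative paradigm described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `State`, `Step`, `run_research`, `toy_policy`, and `toy_search`, the `search:` / `answer:` action format, and the toy corpus are all assumptions made for illustration. The only idea taken from the abstract is that each round sees just the question, the evolving report, and the latest tool result, instead of the full interaction history.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class State:
    """What is carried between rounds: the question and the evolving report."""
    question: str
    report: str


@dataclass(frozen=True)
class Step:
    """One round's output: the consolidated report and the next action."""
    report: str
    action: str  # hypothetical format: "search:<query>" or "answer:<text>"


Policy = Callable[[State, str], Step]  # (state, latest observation) -> step
Tool = Callable[[str], str]            # query -> raw tool result


def run_research(question: str, policy: Policy, search: Tool,
                 max_rounds: int = 8) -> str:
    state = State(question, report="")
    observation = ""
    for _ in range(max_rounds):
        # The policy sees only the focused workspace, never older raw results.
        step = policy(state, observation)
        # Markov-style transition: the next state depends only on the new report.
        state = State(question, step.report)
        kind, _, payload = step.action.partition(":")
        if kind == "answer":
            return payload
        observation = search(payload)
    # Round budget exhausted: fall back to the report built so far.
    return state.report


# Toy stand-ins for a search tool and a model (illustrative only).
TOY_CORPUS = {
    "author of Book X": "Book X was written by Alice.",
    "birthplace of Alice": "Alice was born in Springfield.",
}


def toy_search(query: str) -> str:
    return TOY_CORPUS.get(query, "no result")


def toy_policy(state: State, observation: str) -> Step:
    # Consolidate the latest observation into the report.
    report = (state.report + " " + observation).strip()
    if "Alice" not in report:
        return Step(report, "search:author of Book X")
    if "Springfield" not in report:
        return Step(report, "search:birthplace of Alice")
    return Step(report, "answer:Springfield")


if __name__ == "__main__":
    print(run_research("Where was the author of Book X born?",
                       toy_policy, toy_search))
```

In this sketch the workspace stays the same size no matter how many rounds run, because raw tool results are folded into the report and then discarded. A real agent would replace `toy_policy` with a language model call and `toy_search` with actual web tools.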