WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research
Zijian Li, Xin Guan, Bo Zhang, Shen Huang, Houquan Zhou, Shaopeng Lai, Ming Yan, Yong Jiang, Pengjun Xie, Fei Huang, Jun Zhang, Jingren Zhou
2025-09-17
Summary
This paper introduces a new AI system called WebWeaver designed to perform complex research tasks by gathering information from the internet and writing detailed reports, much like a human researcher would.
What's the problem?
Current AI systems struggle with in-depth research because they typically follow a rigid process: first planning what to research, then finding information, and finally writing the report. This approach has two main weaknesses. First, the plan is fixed up front and cannot adapt as new information is found. Second, generating a very long report in a single pass often leads to errors, such as overlooking information buried in the middle of a long input or making things up (hallucinations). As a result, these systems handle the complexity of real-world research poorly.
What's the solution?
WebWeaver uses a two-part system that mimics how humans research. First, a 'planner' agent continuously refines the research outline while actively searching for evidence and storing it in a memory bank. It doesn't just plan once at the beginning; it adjusts the plan as it learns. Then, a 'writer' agent uses this organized evidence to write the report section by section, retrieving only the specific evidence needed for each part. This focused approach mitigates long-context failures because the writer never has to process all of the evidence at once.
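The planner's cycle described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `MemoryBank`, `Section`, `toy_search`, `toy_follow_ups`, and `plan` are hypothetical, the search tool is replaced by a canned lookup, and the outline refinement that an LLM would perform is replaced by a simple keyword rule. It only shows the shape of the loop, in which each round of evidence gathering can change what the outline covers next.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBank:
    """Hypothetical evidence store: maps an integer id to a text snippet."""
    items: dict = field(default_factory=dict)

    def add(self, snippet: str) -> int:
        evidence_id = len(self.items) + 1
        self.items[evidence_id] = snippet
        return evidence_id


@dataclass
class Section:
    """One outline entry, linked to the evidence ids that support it."""
    title: str
    evidence_ids: list = field(default_factory=list)


def toy_search(query: str) -> list:
    """Stand-in for a web search tool (assumption: returns text snippets)."""
    corpus = {
        "solar cost": ["Snippet A on solar module prices",
                       "Snippet B on storage costs"],
        "storage costs": ["Snippet C on battery pricing trends"],
    }
    return corpus.get(query, [])


def toy_follow_ups(snippets: list) -> list:
    """Stand-in for LLM-driven outline refinement: a keyword rule."""
    if any("storage" in s for s in snippets):
        return ["storage costs"]
    return []


def plan(initial_query: str, max_rounds: int = 3):
    """Interleave evidence acquisition with outline updates."""
    bank = MemoryBank()
    outline = []
    pending = [initial_query]
    seen = set()
    for _ in range(max_rounds):
        if not pending:
            break
        query = pending.pop(0)
        seen.add(query)
        snippets = toy_search(query)
        if not snippets:
            continue
        # Store the evidence and link it to a new outline section.
        ids = [bank.add(s) for s in snippets]
        outline.append(Section(title=query, evidence_ids=ids))
        # New evidence may extend the plan with follow-up topics.
        for follow_up in toy_follow_ups(snippets):
            if follow_up not in seen and follow_up not in pending:
                pending.append(follow_up)
    return outline, bank


if __name__ == "__main__":
    outline, bank = plan("solar cost")
    for section in outline:
        print(section.title, section.evidence_ids)
```

In this toy run the second outline section exists only because evidence found in the first round mentioned storage, which is the behavior a plan fixed up front cannot reproduce.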
Why does it matter?
This research matters because it improves the ability of AI to conduct thorough and reliable research. WebWeaver reports state-of-the-art results on the DeepResearch Bench, DeepConsult, and DeepResearchGym benchmarks, which suggests that a more flexible, iterative, and focused approach is key to producing high-quality, well-supported reports. This has implications for any field that needs automated information synthesis.
Abstract
This paper tackles open-ended deep research (OEDR), a complex challenge where AI agents must synthesize vast web-scale information into insightful reports. Current approaches are plagued by dual-fold limitations: static research pipelines that decouple planning from evidence acquisition and one-shot generation paradigms that easily suffer from long-context failure issues like "lost in the middle" and hallucinations. To address these challenges, we introduce WebWeaver, a novel dual-agent framework that emulates the human research process. The planner operates in a dynamic cycle, iteratively interleaving evidence acquisition with outline optimization to produce a comprehensive, source-grounded outline linking to a memory bank of evidence. The writer then executes a hierarchical retrieval and writing process, composing the report section by section. By performing targeted retrieval of only the necessary evidence from the memory bank for each part, it effectively mitigates long-context issues. Our framework establishes a new state-of-the-art across major OEDR benchmarks, including DeepResearch Bench, DeepConsult, and DeepResearchGym. These results validate our human-centric, iterative methodology, demonstrating that adaptive planning and focused synthesis are crucial for producing high-quality, reliable, and well-structured reports.
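The writer's targeted retrieval can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's method: the outline is flattened to a single level rather than a hierarchy, `OUTLINE`, `MEMORY_BANK`, `retrieve`, `compose_section`, and `write_report` are hypothetical names, and `compose_section` stands in for a language-model call by simply concatenating the evidence with citation markers.

```python
# Hypothetical planner output: each section cites evidence ids from the bank.
OUTLINE = [
    {"title": "Background", "evidence_ids": [1]},
    {"title": "Findings", "evidence_ids": [2, 3]},
]

# Hypothetical memory bank; id 4 is not cited by any section.
MEMORY_BANK = {
    1: "Evidence one.",
    2: "Evidence two.",
    3: "Evidence three.",
    4: "Unused evidence.",
}


def retrieve(bank, evidence_ids):
    """Fetch only the evidence a single section cites."""
    return [(i, bank[i]) for i in evidence_ids]


def compose_section(title, evidence):
    """Stand-in for an LLM call: joins evidence with citation markers."""
    body = " ".join(f"{text} [{i}]" for i, text in evidence)
    return f"{title}\n{body}"


def write_report(outline, bank):
    """Write section by section, so each step sees a small context."""
    sections = []
    for node in outline:
        evidence = retrieve(bank, node["evidence_ids"])
        sections.append(compose_section(node["title"], evidence))
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(write_report(OUTLINE, MEMORY_BANK))
```

Because each call to `compose_section` receives only the evidence its section cites, the input for any one writing step stays small no matter how large the memory bank grows.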