Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

Yuxuan Huang, Yihang Chen, Zhiyuan He, Yuxiang Chen, Ka Yiu Lee, Huichi Zhou, Weilin Luo, Meng Fang, Jun Wang

2026-05-04

Summary

This paper introduces a new system called Web2BigTable designed to improve how computers perform web searches that require either a lot of detailed reasoning about a single topic, or gathering information from many different sources and organizing it neatly.

What's the problem?

Current web search systems struggle with two main types of tasks. First, they have trouble with tasks that need deep understanding and logical steps to find an answer. Second, they aren't great at collecting information about many different things and presenting it in a structured way, like a table, while making sure everything is consistent. Essentially, they can't easily do both broad, organized searches *and* in-depth investigations.

What's the solution?

Web2BigTable uses a team of 'agent' programs working together. One 'orchestrator' agent breaks down a complex search into smaller, manageable pieces, and then assigns those pieces to 'worker' agents. These worker agents search the web in parallel. They share what they find with each other to avoid repeating work, resolve disagreements, and fill in gaps in their knowledge. Importantly, the system learns from its mistakes and improves over time by remembering past searches and updating how it approaches new problems, all in a way that humans can understand.
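The division of labour described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the names `SharedWorkspace`, `decompose`, `fake_lookup`, `worker`, and `run` are hypothetical, the "web search" is a stub that returns a fixed-schema row, and the memory-based learning loop is left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class SharedWorkspace:
    """Hypothetical shared store where workers publish partial findings."""

    def __init__(self):
        self._lock = Lock()
        self._rows = {}

    def claim(self, entity):
        """Return True only for the first worker that asks for this entity."""
        with self._lock:
            if entity in self._rows:
                return False
            self._rows[entity] = None  # placeholder: entity is in progress
            return True

    def publish(self, entity, row):
        """Make a finished row visible to every other worker."""
        with self._lock:
            self._rows[entity] = row

    def table(self):
        """Return only the rows that have actually been filled in."""
        with self._lock:
            return {k: v for k, v in self._rows.items() if v is not None}


def decompose(entities, n_workers):
    """Toy orchestrator: split the entity list round-robin into sub-problems."""
    return [entities[i::n_workers] for i in range(n_workers)]


def fake_lookup(entity):
    """Stand-in for a real web search (assumption); returns a fixed-schema row."""
    return {"name": entity, "name_length": len(entity)}


def worker(sub_problem, workspace):
    """Process one sub-problem, skipping entities another worker already claimed."""
    for entity in sub_problem:
        if workspace.claim(entity):
            workspace.publish(entity, fake_lookup(entity))


def run(entities, n_workers=3):
    """Decompose the task, run the workers in parallel, and return the table."""
    workspace = SharedWorkspace()
    sub_problems = decompose(entities, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(worker, sp, workspace) for sp in sub_problems]
        for future in futures:
            future.result()  # re-raise any worker exception
    return workspace.table()
```

Calling `run(["alpha", "beta", "gamma", "beta"], n_workers=2)` yields one row per distinct entity: the duplicate `"beta"` is looked up only once because the second worker sees the claim in the shared workspace.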

Why does it matter?

This research matters because it substantially improves web search performance on complex tasks. On the WideSearch benchmark, which tests broad, table-style information gathering, Web2BigTable reaches an Avg@4 Success Rate of 38.50, compared with 5.10 for the second-best system. It also carries over to deep-reasoning search, reaching 73.0 accuracy on XBench-DeepSearch. Results like these could lead to more accurate and comprehensive search engines and AI assistants.

Abstract

Agentic web search increasingly faces two distinct demands: deep reasoning over a single target, and structured aggregation across many entities and heterogeneous sources. Current systems struggle on both fronts. Breadth-oriented tasks demand schema-aligned outputs with wide coverage and cross-entity consistency, while depth-oriented tasks require coherent reasoning over long, branching search trajectories. We introduce Web2BigTable, a multi-agent framework for web-to-table search that supports both regimes. Web2BigTable adopts a bi-level architecture in which an upper-level orchestrator decomposes the task into sub-problems and lower-level worker agents solve them in parallel. Through a closed-loop run-verify-reflect process, the framework jointly improves decomposition and execution over time via persistent, human-readable external memory, with self-evolving updates to each individual agent. During execution, workers coordinate through a shared workspace that makes partial findings visible, allowing them to reduce redundant exploration, reconcile conflicting evidence, and adapt to emerging coverage gaps. Web2BigTable sets a new state of the art on WideSearch, reaching an Avg@4 Success Rate of 38.50 (7.5 times the second best at 5.10), Row F1 of 63.53 (+25.03 over the second best), and Item F1 of 80.12 (+14.42 over the second best). It also generalises to depth-oriented search on XBench-DeepSearch, achieving 73.0 accuracy. Code is available at https://github.com/web2bigtable/web2bigtable.
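As a quick sanity check on the reported margins, the second-best scores implied by the abstract can be recovered with simple arithmetic. The sketch below uses only the numbers quoted above; the variable and function names are illustrative, and the implied Row F1 and Item F1 baselines are derived values, not figures stated in the source.

```python
# Figures quoted in the abstract for WideSearch.
web2bigtable = {"success_rate": 38.50, "row_f1": 63.53, "item_f1": 80.12}
second_best_success_rate = 5.10
margins = {"row_f1": 25.03, "item_f1": 14.42}


def success_rate_ratio():
    """Ratio of the reported Success Rate to the second-best Success Rate."""
    return web2bigtable["success_rate"] / second_best_success_rate


def implied_second_best(metric):
    """Second-best score implied by subtracting the reported margin."""
    return round(web2bigtable[metric] - margins[metric], 2)


print(f"Success Rate ratio: {success_rate_ratio():.2f}x")
print(f"Implied second-best Row F1: {implied_second_best('row_f1'):.2f}")
print(f"Implied second-best Item F1: {implied_second_best('item_f1'):.2f}")
```

The ratio comes out at roughly 7.55, consistent with the "7.5 times" stated in the abstract.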
