
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents

Zheng Chu, Xiao Wang, Jack Hong, Huiming Fan, Yuqi Huang, Yue Yang, Guohai Xu, Chenxiao Zhao, Cheng Xiang, Shengchao Hu, Dongdong Kuang, Ming Liu, Bing Qin, Xing Yu

2026-02-17


Summary

This paper focuses on making large language models better at complex search tasks, moving beyond just knowing information to actually solving problems that require searching and using tools.

What's the problem?

Currently, it's really hard to train these models to be good at searching because useful examples for them to learn from are difficult and expensive to obtain. Creating complex tasks for them to practice on is challenging, and every attempt to solve a problem takes a lot of computing power, since the model must repeatedly call external tools like search engines to see what happens. Essentially, there's a lack of good training data, and the training process itself is slow and costly.

What's the solution?

The researchers developed a system called REDSearcher that tackles this problem in a few key ways. First, they figured out a way to automatically create complex search tasks that are just the right level of difficulty. Second, they encourage the model to actively *use* tools instead of just remembering facts. Third, they improved the model's basic skills – like understanding information, planning steps, and using functions – before letting it tackle the full search tasks, which makes learning more efficient. Finally, they created a simulated environment to quickly test and refine their approach without needing to constantly interact with real-world tools.
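The paper's first idea, generating tasks whose difficulty is controlled by graph topology and evidence dispersion, can be sketched roughly as follows. This is an illustrative assumption of how such a dual-constrained filter might work, not the paper's actual implementation; all function names, metrics, and thresholds here are hypothetical.

```python
from collections import deque

def evidence_dispersion(evidence_sources):
    """Fraction of evidence pieces that come from distinct sources.
    Higher dispersion forces the agent to consult more tools/pages."""
    if not evidence_sources:
        return 0.0
    return len(set(evidence_sources)) / len(evidence_sources)

def topology_difficulty(edges, start, answer):
    """Reasoning-hop distance from the seed entity to the answer
    entity on the task's entity graph, via BFS (a proxy for how
    'deep' the required search chain is)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == answer:
            return dist
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")

def accept_task(edges, start, answer, sources,
                hop_band=(3, 6), min_dispersion=0.5):
    """Dual-constrained filter: keep a synthesized task only if its
    hop difficulty lands in the target band AND its evidence is
    dispersed enough to require real tool use rather than recall."""
    hops = topology_difficulty(edges, start, answer)
    disp = evidence_dispersion(sources)
    return hop_band[0] <= hops <= hop_band[1] and disp >= min_dispersion
```

A synthesizer could generate many candidate question graphs and keep only those passing `accept_task`, giving a scalable supply of tasks at a chosen difficulty level.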

Why it matters?

This work is important because it significantly improves the ability of large language models to perform complex searches, bringing us closer to having AI assistants that can genuinely help us solve real-world problems. The researchers are also sharing their training data and code, which will help other researchers build on this work and further advance the field of AI search agents.

Abstract

Large language models are transitioning from general-purpose knowledge engines to real-world problem solvers, yet optimizing them for deep search tasks remains challenging. The central bottleneck lies in the extreme sparsity of high-quality search trajectories and reward signals, arising from the difficulty of scalable long-horizon task construction and the high cost of interaction-heavy rollouts involving external tool calls. To address these challenges, we propose REDSearcher, a unified framework that co-designs complex task synthesis, mid-training, and post-training for scalable search-agent optimization. Specifically, REDSearcher introduces the following improvements: (1) We frame task synthesis as a dual-constrained optimization, where task difficulty is precisely governed by graph topology and evidence dispersion, allowing scalable generation of complex, high-quality tasks. (2) We introduce tool-augmented queries to encourage proactive tool use rather than passive recall. (3) During mid-training, we strengthen core atomic capabilities (knowledge, planning, and function calling), substantially reducing the cost of collecting high-quality trajectories for downstream training. (4) We build a local simulated environment that enables rapid, low-cost algorithmic iteration for reinforcement learning experiments. Across both text-only and multimodal search-agent benchmarks, our approach achieves state-of-the-art performance. To facilitate future research on long-horizon search agents, we will release 10K high-quality complex text search trajectories, 5K multimodal trajectories, and a 1K text RL query set, together with code and model checkpoints.
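The abstract's fourth point, a local simulated environment for cheap RL iteration, can be sketched as a cached stand-in for a live search tool. This is a hypothetical minimal sketch (class name, action format, and reward scheme are all assumptions): serving canned documents from a local corpus makes rollouts fast, free of API costs, and reproducible.

```python
class SimulatedSearchEnv:
    """Minimal local stand-in for a live search tool during RL
    rollouts: queries hit a cached corpus instead of a real engine."""

    def __init__(self, corpus, max_steps=8):
        self.corpus = corpus        # maps query string -> list of snippets
        self.max_steps = max_steps

    def reset(self, question, answer):
        """Start an episode for one question; the gold answer is kept
        only to compute the terminal reward."""
        self.question, self.answer = question, answer
        self.steps = 0
        return {"question": question}

    def step(self, action):
        """action is ('search', query) or ('answer', text).
        Reward is 1.0 for a correct final answer, else 0.0."""
        self.steps += 1
        kind, arg = action
        if kind == "answer":
            return {"done": True, "reward": 1.0 if arg == self.answer else 0.0}
        snippets = self.corpus.get(arg, ["<no results>"])
        return {"done": self.steps >= self.max_steps,
                "reward": 0.0,
                "snippets": snippets}
```

An RL loop would call `reset`, let the policy alternate search and answer actions, and use the sparse terminal reward; swapping in a real search API later only changes the `step` lookup.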