SmartSearch: Process Reward-Guided Query Refinement for Search Agents

Tongyu Wen, Guanting Dong, Zhicheng Dou

2026-01-12

Summary

This paper introduces a new system called SmartSearch that aims to improve how AI search agents find information. These agents use large language models to answer questions that require looking up facts, but often struggle because the search queries they create aren't very good.

What's the problem?

Current AI search agents are pretty good at *thinking* through a problem, but they aren't so good at actually *asking* the right questions to find the information they need. They generate search queries that are inaccurate or don't quite hit the mark, leading to irrelevant search results and, ultimately, incorrect answers. Prior work has focused on improving the reasoning process, but not on the quality of the searches themselves.

What's the solution?

SmartSearch tackles this problem with two main ideas. First, it uses 'process rewards' to give feedback on how good each search query is as it's being created, essentially teaching the AI to write better questions. This feedback comes from a 'Dual-Level Credit Assessment' that evaluates each query at two levels of granularity. Second, it 'refines' bad queries: instead of continuing with a poor search, it rewrites the low-quality query and regenerates the subsequent search rounds based on the improved version. To help the AI internalize this process, the researchers designed a three-stage curriculum: first showing the AI good examples (imitation), then guiding it to improve under process-reward feedback (alignment), and finally letting it apply the skill to new situations (generalization).
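To make the refine-then-retry idea concrete, here is a minimal sketch of a process-reward-guided query loop. This is an illustration only: the function names (`score_query`, `refine_query`, `search_round`), the toy scoring heuristic, and the threshold are all hypothetical stand-ins, not the paper's actual Dual-Level Credit Assessment or training pipeline.

```python
# Illustrative sketch of process-reward-guided query refinement.
# All names here (score_query, refine_query, search_round) are hypothetical
# placeholders; the paper's real reward model is an LLM-based assessment.

def score_query(query: str) -> float:
    """Toy process reward: favor longer, well-formed queries.
    (A placeholder for the paper's Dual-Level Credit Assessment.)"""
    words = query.split()
    specificity = min(len(words) / 6.0, 1.0)  # longer queries score higher
    penalty = 0.5 if "?" not in query and len(words) < 3 else 0.0
    return max(specificity - penalty, 0.0)

def refine_query(query: str, question: str) -> str:
    """Placeholder refinement: ground the draft query in the original question.
    (In SmartSearch, an LLM rewrites the low-quality query instead.)"""
    return f"{query} {question}".strip()

def search_round(question: str, draft_query: str,
                 threshold: float = 0.6, max_refinements: int = 2) -> str:
    """Refine a low-reward query before committing to retrieval."""
    query = draft_query
    for _ in range(max_refinements):
        if score_query(query) >= threshold:
            break  # query is good enough; proceed to retrieval
        query = refine_query(query, question)
    return query

final = search_round("Who discovered penicillin?", "penicillin")
```

The key design choice this sketch mirrors is that feedback is applied *per query, during reasoning*, rather than only to the final answer, so a bad intermediate search can be fixed before it pollutes all subsequent rounds.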

Why it matters?

This work is important because improving the search query quality directly leads to more effective AI agents. By making the searches more efficient and accurate, SmartSearch helps these agents find the right information faster and provide better, more reliable answers to complex questions. This is a step towards building AI systems that can truly leverage the vast amount of knowledge available online.

Abstract

Large language model (LLM)-based search agents have proven promising for addressing knowledge-intensive problems by incorporating information retrieval capabilities. Existing works largely focus on optimizing the reasoning paradigms of search agents, yet the quality of intermediate search queries during reasoning remains overlooked. As a result, the generated queries often remain inaccurate, leading to unexpected retrieval results and ultimately limiting search agents' overall effectiveness. To mitigate this issue, we introduce SmartSearch, a framework built upon two key mechanisms: (1) Process rewards, which provide fine-grained supervision for the quality of each intermediate search query through Dual-Level Credit Assessment. (2) Query refinement, which promotes the optimization of query generation by selectively refining low-quality search queries and regenerating subsequent search rounds based on these refinements. To enable the search agent to progressively internalize the ability to improve query quality under the guidance of process rewards, we design a three-stage curriculum learning framework. This framework guides the agent through a progression from imitation, to alignment, and ultimately to generalization. Experimental results show that SmartSearch consistently surpasses existing baselines, and additional quantitative analyses further confirm its significant gains in both search efficiency and query quality. The code is available at https://github.com/MYVAE/SmartSearch.