ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning

Changtai Zhu, Siyin Wang, Ruijun Feng, Kai Song, Xipeng Qiu

2025-05-22

Summary

This paper introduces ConvSearch-R1, an AI system that rewrites the questions people ask during a conversation so that search engines can understand them better and return more accurate, helpful results.

What's the problem?

In conversational search, a question often depends on earlier parts of the conversation, so on its own it can be ambiguous or incomplete. Traditional methods for rewriting these questions usually need a lot of human help or expensive AI models, and the rewrites they produce are still not always well matched to the search engines that actually find the answers.

What's the solution?

The researchers created ConvSearch-R1, which uses reinforcement learning and a process called self-distillation to teach itself how to rewrite unclear questions into clear, self-contained ones without any outside supervision. It learns directly from how well its rewritten questions help the search engine find the right answers, using a two-stage process: one stage to get started and a second to keep improving.
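To make the idea of "learning from how well a rewrite helps the search engine" concrete, here is a minimal sketch in Python. It is not the paper's reward function or retriever: the word-overlap retriever, the reciprocal-rank reward, and all names (`rank_passages`, `retrieval_reward`, the sample passages) are assumptions chosen only for illustration.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase and split on whitespace (toy tokenizer, an assumption)."""
    return set(text.lower().split())


def rank_passages(query: str, passages: List[str]) -> List[int]:
    """Toy retriever: order passage indices by word overlap with the query.

    Python's sort is stable, so tied passages keep their original order.
    """
    q = tokenize(query)
    scores = [len(q & tokenize(p)) for p in passages]
    return sorted(range(len(passages)), key=lambda i: -scores[i])


def retrieval_reward(query: str, passages: List[str], gold_index: int) -> float:
    """Hypothetical reward: reciprocal rank of the passage that answers the question."""
    ranking = rank_passages(query, passages)
    return 1.0 / (ranking.index(gold_index) + 1)


# Conversation so far was about the Eiffel Tower; the user then asks "how tall is it".
passages = [
    "mount everest is 8849 metres tall",
    "the eiffel tower is 330 metres tall",
    "the eiffel tower opened in 1889",
]
gold_index = 1  # the passage that actually answers the question

raw_reward = retrieval_reward("how tall is it", passages, gold_index)
rewritten_reward = retrieval_reward("how tall is the eiffel tower", passages, gold_index)

print(raw_reward, rewritten_reward)
```

In this toy setup the self-contained rewrite places the answering passage higher than the original question does, so it earns the larger reward. A reinforcement learning loop would use a signal of this kind to favor rewrites that retrieve well, with no human-written rewrites involved.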

Why does it matter?

This matters because conversational search can become more accurate and more cost-effective, giving people better answers without needing large amounts of extra data, human effort, or huge AI models.

Abstract

ConvSearch-R1 uses reinforcement learning and self-distillation to improve conversational query reformulation without relying on external supervision, outperforming state-of-the-art methods.
