DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

Xinyan Guan, Jiali Zeng, Fandong Meng, Chunlei Xin, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun, Jie Zhou

2025-02-04

Summary

This paper introduces DeepRAG, a new method that helps large language models (LLMs) answer questions more accurately by deciding, step by step, when to rely on their own knowledge and when to search for extra information. It makes the model's reasoning and retrieval more deliberate at every stage.

What's the problem?

LLMs are good at reasoning but often produce false or outdated information, a problem called hallucination. When combined with retrieval-augmented generation (RAG), which lets them search external data, the process can become messy: the model may retrieve too much or irrelevant information, adding noise and making its answers less reliable.

What's the solution?

The researchers created DeepRAG, a system that treats the process of answering questions as a decision-making framework called a Markov Decision Process (MDP). DeepRAG breaks a complex question into smaller subquestions and, for each one, decides whether to use the model's internal knowledge or search for external information. This step-by-step approach makes retrieval more strategic and efficient, reducing errors and improving answer quality.
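The decide-then-answer loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the paper's implementation: `decompose_query`, `answer_confidently`, and `retrieve` are hypothetical stubs standing in for the real subquery generator, the model's parametric answer attempt, and the external search call.

```python
# Illustrative sketch of DeepRAG-style adaptive retrieval.
# All three helpers are toy stand-ins, not the paper's actual components.

def decompose_query(question):
    """Stub: split a question into subqueries (fixed toy split here)."""
    return [f"{question} - step {i}" for i in (1, 2)]

def answer_confidently(subquery, context):
    """Stub: return a parametric answer, or None when the model is unsure."""
    return f"internal({subquery})" if "step 1" in subquery else None

def retrieve(subquery):
    """Stub: stand-in for an external search/retrieval call."""
    return f"retrieved({subquery})"

def deep_rag(question):
    """At each step (a state in the MDP), pick one of two actions:
    trust parametric knowledge, or retrieve external evidence."""
    context, retrievals = [], 0
    for sub in decompose_query(question):
        ans = answer_confidently(sub, context)   # action 1: answer internally
        if ans is None:                          # action 2: fall back to retrieval
            ans = retrieve(sub)
            retrievals += 1
        context.append(ans)                      # fold result into the context
    return " | ".join(context), retrievals

print(deep_rag("Who directed the Best Picture winner of 1998?"))
```

Because retrieval is only triggered when the internal answer is unavailable, the sketch shows why this strategy can cut down redundant retrievals compared with retrieving at every step.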

Why it matters?

This research is important because it makes AI systems like chatbots and virtual assistants more accurate and trustworthy. By improving answer accuracy by nearly 22%, DeepRAG helps reduce false information and supports better performance on tasks that combine reasoning with external knowledge. It's a significant step toward making AI more reliable for real-world applications.

Abstract

Large Language Models (LLMs) have shown remarkable potential in reasoning while they still suffer from severe factual hallucinations due to timeliness, accuracy, and coverage of parametric knowledge. Meanwhile, integrating reasoning with retrieval-augmented generation (RAG) remains challenging due to ineffective task decomposition and redundant retrieval, which can introduce noise and degrade response quality. In this paper, we propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP), enabling strategic and adaptive retrieval. By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step. Experiments show that DeepRAG improves retrieval efficiency while improving answer accuracy by 21.99%, demonstrating its effectiveness in optimizing retrieval-augmented reasoning.