MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search
Yunhai Hu, Yilun Zhao, Chen Zhao, Arman Cohan
2025-03-27
Summary
This paper improves the ability of AI models to answer questions by combining two techniques: retrieving relevant information and using a tree-based search to refine the reasoning that produces the answer.
What's the problem?
Small AI models often struggle to answer complex questions that require a lot of knowledge, and they can sometimes make up facts.
What's the solution?
The researchers developed a new method called MCTS-RAG that interleaves information retrieval with a tree search algorithm, allowing the AI to decide during reasoning when to look up relevant information and how to use it.
Why it matters?
This work matters because it can help small AI models answer complex questions more accurately and reliably.
Abstract
We introduce MCTS-RAG, a novel approach that enhances the reasoning capabilities of small language models on knowledge-intensive tasks by leveraging retrieval-augmented generation (RAG) to provide relevant context and Monte Carlo Tree Search (MCTS) to refine reasoning paths. MCTS-RAG dynamically integrates retrieval and reasoning through an iterative decision-making process. Unlike standard RAG methods, which typically retrieve information independently from reasoning and thus integrate knowledge suboptimally, or conventional MCTS reasoning, which depends solely on internal model knowledge without external facts, MCTS-RAG combines structured reasoning with adaptive retrieval. This integrated approach enhances decision-making, reduces hallucinations, and ensures improved factual accuracy and response consistency. Experimental results on multiple reasoning and knowledge-intensive datasets (i.e., ComplexWebQA, GPQA, and FoolMeTwice) show that our method enables small-scale LMs to achieve performance comparable to frontier LLMs like GPT-4o by effectively scaling inference-time compute, setting a new standard for reasoning in small-scale models.
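The abstract's core idea, a Monte Carlo Tree Search whose action space includes retrieval, can be sketched in miniature. Note that everything below (the corpus, the toy retriever, the fixed candidate queries, and the keyword-based reward) is an invented stand-in for illustration only; the paper's actual system expands reasoning steps with a language model rather than choosing from hand-written queries.

```python
import math

# Hypothetical sketch of an MCTS loop whose actions are retrieval queries,
# so the search decides which evidence to fetch while reasoning.
# All names and data here are toy stand-ins, not the paper's implementation.

CORPUS = {
    "paris": "Paris is the capital of France.",
    "berlin": "Berlin is the capital of Germany.",
}

CANDIDATE_QUERIES = ["capital of france paris", "capital of germany berlin"]

def retrieve(query):
    """Toy retriever: return the document whose key occurs in the query."""
    for key, doc in CORPUS.items():
        if key in query:
            return doc
    return ""

class Node:
    def __init__(self, context, action=None, parent=None):
        self.context = context      # evidence accumulated along this path
        self.action = action        # retrieval query that produced this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def reward(context):
    """Toy reward: 1 if the gathered evidence contains the gold answer term."""
    return 1.0 if "paris" in context.lower() else 0.0

def uct(node, c=1.4):
    """Upper Confidence bound for Trees: exploitation plus exploration."""
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts_rag(question, iterations=50):
    root = Node(context=question)
    for _ in range(iterations):
        # 1. Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # 2. Expansion: branch on candidate retrieval actions.
        if node.visits > 0:
            for query in CANDIDATE_QUERIES:
                child_context = node.context + " " + retrieve(query)
                node.children.append(Node(child_context, query, node))
            node = node.children[0]
        # 3. Simulation: score the evidence gathered so far.
        r = reward(node.context)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # Commit to the most-visited retrieval action at the root.
    best = max(root.children, key=lambda n: n.visits)
    return best.action, best.context

action, context = mcts_rag("What is the capital of France?")
```

Because retrieval happens inside the tree, a branch that fetches useful evidence accumulates higher reward and draws more visits, which is the mechanism the abstract describes for integrating retrieval with structured reasoning rather than retrieving once up front.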