FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning
Lin Sun, Linglin Zhang, Jingang Huang, Change Jia, Zhengwei Cheng, Xiangzheng Zhang
2026-01-28
Summary
This paper investigates whether we still need systems that 'augment' Large Language Models (LLMs) with information retrieval, now that LLMs can process very long pieces of text. It introduces a new system called FABLE that improves upon existing methods for combining LLMs and retrieval.
What's the problem?
While LLMs are getting better at handling long inputs, they still struggle with a few key issues: they can 'lose the thread' when important information is buried in the middle of a long document (the 'lost-in-the-middle' problem); processing long texts is computationally expensive; and it is hard to get them to reason effectively across multiple documents. Traditional methods of adding retrieved information to LLMs, called Retrieval-Augmented Generation (RAG), are efficient but often pull in irrelevant information and do a poor job of combining evidence from different sources in a structured way.
What's the solution?
The researchers developed FABLE, which stands for Forest-based Adaptive Bi-path LLM-Enhanced retrieval. FABLE first organizes the source documents into a forest of hierarchical, tree-like indexes, using LLMs to summarize the content at multiple levels of detail. When a question is asked, FABLE takes a two-pronged approach: an LLM navigates this structure from the top down to find relevant information, and relevance is also propagated through the structure to pull in closely related evidence. Importantly, FABLE can adjust how much retrieval effort it spends, via an explicit budget, to trade accuracy against speed. A rough sketch of the indexing step appears below.
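To make the indexing idea concrete, here is a minimal Python sketch. The paper's construction details are not given in this summary, so everything here is an illustrative assumption: llm_summarize is a hypothetical stand-in for a real LLM summarization call, and the fixed fanout grouping is just one simple way to form the hierarchy.

    # Minimal, illustrative sketch of an LLM-enhanced hierarchical forest index.
    # `llm_summarize` is a hypothetical placeholder, not FABLE's published code.
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        text: str                                   # LLM summary (internal) or raw chunk (leaf)
        children: list["Node"] = field(default_factory=list)

    def llm_summarize(texts: list[str]) -> str:
        # Placeholder: a real system would prompt an LLM to condense these passages.
        return " | ".join(t[:40] for t in texts)

    def build_tree(chunks: list[str], fanout: int = 4) -> Node:
        # Bottom-up construction: group nodes, summarize each group with the LLM,
        # then repeat on the summaries until a single root remains.
        level = [Node(text=c) for c in chunks]
        while len(level) > 1:
            parents = []
            for i in range(0, len(level), fanout):
                group = level[i:i + fanout]
                parents.append(Node(text=llm_summarize([n.text for n in group]),
                                    children=group))
            level = parents
        return level[0]

    def build_forest(documents: list[list[str]]) -> list[Node]:
        # One tree per document; together the trees form the forest index.
        # e.g. build_forest([["chunk 1 of doc A", "chunk 2 of doc A"], ["chunk 1 of doc B"]])
        return [build_tree(chunks) for chunks in documents]

Each internal node carries an LLM-written summary of everything beneath it, which is what lets retrieval later operate at multiple levels of detail instead of over a flat pile of chunks.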
Why it matters?
This work shows that even with increasingly powerful LLMs, structured retrieval methods like FABLE are still valuable. FABLE achieves accuracy similar to feeding the entire long context to an LLM, but with up to 94% fewer tokens, making it much more efficient. In other words, long-context LLMs do not eliminate the need for smart ways to find and organize information; if anything, they amplify it.
Abstract
The rapid expansion of long-context Large Language Models (LLMs) has reignited debate on whether Retrieval-Augmented Generation (RAG) remains necessary. However, empirical evidence reveals persistent limitations of long-context inference, including the lost-in-the-middle phenomenon, high computational cost, and poor scalability for multi-document reasoning. Conversely, traditional RAG systems, while efficient, are constrained by flat chunk-level retrieval that introduces semantic noise and fails to support structured cross-document synthesis. We present FABLE, a Forest-based Adaptive Bi-path LLM-Enhanced retrieval framework that integrates LLMs into both knowledge organization and retrieval. FABLE constructs LLM-enhanced hierarchical forest indexes with multi-granularity semantic structures, then employs a bi-path strategy combining LLM-guided hierarchical traversal with structure-aware propagation for fine-grained evidence acquisition, with explicit budget control for adaptive efficiency trade-offs. Extensive experiments demonstrate that FABLE consistently outperforms SOTA RAG methods and achieves comparable accuracy to full-context LLM inference with up to 94% token reduction, showing that long-context LLMs amplify rather than fully replace the need for structured retrieval.
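To illustrate the bi-path strategy, here is a continuation of the sketch above (reusing its Node and build_forest). Again, everything here is an assumption made for illustration: llm_score stands in for an LLM relevance judgment, the beam-style descent and sibling propagation are one plausible reading of 'LLM-guided traversal' and 'structure-aware propagation', and the token budget is approximated by a word count.

    # Illustrative sketch of bi-path retrieval under a token budget; not the
    # paper's exact algorithm. Reuses `Node` from the indexing sketch above.
    def llm_score(query: str, text: str) -> float:
        # Placeholder relevance judgment; a real system would ask an LLM to rate this.
        q, t = set(query.lower().split()), set(text.lower().split())
        return len(q & t) / (len(q) or 1)

    def n_tokens(text: str) -> int:
        return len(text.split())        # crude whitespace proxy for a real tokenizer

    def traverse(node: Node, query: str, beam: int = 2) -> list[Node]:
        # Path 1 (LLM-guided traversal): descend the tree, keeping only the
        # top-`beam` most relevant children at each level.
        if not node.children:
            return [node]
        ranked = sorted(node.children, key=lambda c: llm_score(query, c.text), reverse=True)
        hits: list[Node] = []
        for child in ranked[:beam]:
            hits.extend(traverse(child, query, beam))
        return hits

    def propagate(hits: list[Node], forest: list[Node]) -> list[Node]:
        # Path 2 (structure-aware propagation): add unselected siblings of any
        # hit, so nearby fine-grained evidence rides along with traversal results.
        selected, extra = {id(n) for n in hits}, []
        def visit(node: Node) -> None:
            if any(id(c) in selected for c in node.children):
                extra.extend(c for c in node.children if id(c) not in selected)
            for c in node.children:
                visit(c)
        for tree in forest:
            visit(tree)
        return extra

    def retrieve(forest: list[Node], query: str, budget_tokens: int = 1024) -> list[str]:
        # Merge both paths, rank, then fill the context until the budget is spent.
        # e.g. retrieve(forest, "what does doc A say about X?", budget_tokens=64)
        candidates: list[Node] = []
        for tree in forest:
            candidates.extend(traverse(tree, query))
        candidates.extend(propagate(candidates, forest))
        candidates.sort(key=lambda n: llm_score(query, n.text), reverse=True)
        context, used = [], 0
        for node in candidates:
            cost = n_tokens(node.text)
            if used + cost <= budget_tokens:
                context.append(node.text)
                used += cost
        return context

The budget_tokens knob is what makes retrieval adaptive in this sketch: raising it buys accuracy with more context, lowering it trades accuracy for speed and cost, which corresponds to the explicit budget control the abstract describes.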