MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery
Hongjin Qian, Peitian Zhang, Zheng Liu, Kelong Mao, Zhicheng Dou
2024-09-10

Summary
This paper introduces MemoRAG, a new approach that enhances Retrieval-Augmented Generation (RAG) by using memory-inspired techniques to improve how models handle complex information retrieval and generation tasks.
What's the problem?
Current RAG systems are limited because they can only match specific queries with well-defined knowledge. This makes them less effective for tasks that involve unclear or complicated information needs. As a result, they struggle with anything beyond simple question-answering tasks.
What's the solution?
MemoRAG introduces a dual-system architecture that uses two types of language models. The first model, a lightweight but long-range one, builds a global memory of the database and, given a task, generates draft answers that serve as clues for locating relevant information. Guided by these clues, retrieval tools pull the useful evidence from the database, and a second, more powerful model then generates the final answer from that retrieved evidence. This setup allows MemoRAG to produce high-quality responses even for complex queries, improving both retrieval and generation performance.
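The two-stage flow described above can be sketched in a toy form. Everything below is a stand-in, not the paper's actual models: a cheap "memory model" turns the query into clue terms, a simple keyword retriever matches those clues against a small corpus, and a heavier "expressive model" composes an answer from the retrieved passages.

```python
# Toy sketch of MemoRAG's dual-system flow. All components here are
# illustrative stand-ins: real MemoRAG uses a light long-range LLM to
# draft clue answers and an expressive LLM to generate the final answer.

CORPUS = [
    "MemoRAG forms a global memory of the database with a light LLM.",
    "The draft answer acts as a clue that guides the retrieval tools.",
    "An expressive LLM generates the final answer from retrieved evidence.",
]

def memory_model_draft(query: str) -> list[str]:
    """Stand-in for the light memory model: emit clue terms for retrieval."""
    # A real system would generate a draft answer from its global memory;
    # here we just split the query into candidate clue tokens.
    return [tok.strip("?.,").lower() for tok in query.split() if len(tok) > 3]

def retrieve(clues: list[str], corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by how many clue terms they contain (keyword overlap)."""
    scored = sorted(corpus, key=lambda p: -sum(c in p.lower() for c in clues))
    return scored[:k]

def expressive_model_answer(query: str, evidence: list[str]) -> str:
    """Stand-in for the expensive LLM: compose an answer from evidence."""
    return f"Q: {query} | Evidence: " + " ".join(evidence)

query = "How does the memory model guide retrieval?"
clues = memory_model_draft(query)          # stage 1: draft clues
evidence = retrieve(clues, CORPUS)         # stage 2: clue-guided retrieval
answer = expressive_model_answer(query, evidence)  # stage 3: final answer
print(answer)
```

The key design point this sketch mirrors is the division of labor: the cheap first pass only needs to be good enough to point retrieval at the right evidence, so the expensive model is invoked once, on a focused context.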
Why it matters?
This research is important because it addresses the limitations of existing RAG systems, making them more capable of handling a wider range of tasks. By improving how models retrieve and generate information together, MemoRAG can enhance applications in areas like customer support, content creation, and any situation where understanding complex queries is essential.
Abstract
Retrieval-Augmented Generation (RAG) leverages retrieval tools to access external databases, thereby enhancing the generation quality of large language models (LLMs) through optimized context. However, the existing retrieval methods are inherently constrained, as they can only perform relevance matching between explicitly stated queries and well-formed knowledge, and are unable to handle tasks involving ambiguous information needs or unstructured knowledge. Consequently, existing RAG systems are primarily effective for straightforward question-answering tasks. In this work, we propose MemoRAG, a novel retrieval-augmented generation paradigm empowered by long-term memory. MemoRAG adopts a dual-system architecture. On the one hand, it employs a light but long-range LLM to form the global memory of the database. Once a task is presented, it generates draft answers, cluing the retrieval tools to locate useful information within the database. On the other hand, it leverages an expensive but expressive LLM, which generates the ultimate answer based on the retrieved information. Building on this general framework, we further optimize MemoRAG's performance by enhancing its cluing mechanism and memorization capacity. In our experiment, MemoRAG achieves superior performance across a variety of evaluation tasks, including both complex ones where conventional RAG fails and straightforward ones where RAG is commonly applied.