
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, Yu Su

2025-02-21

Summary

This paper introduces HippoRAG 2, a new system that helps AI language models learn and remember information more like humans do. It improves on existing methods by making AI better at connecting ideas and at remembering facts over time.

What's the problem?

Current AI language models struggle to learn new information continuously without forgetting what they already know. The main method used today, retrieval-augmented generation (RAG), is good at finding information but not great at connecting ideas, and recent structured variants remember basic facts worse than standard RAG does. It's like having a really smart friend who can look things up quickly but sometimes forgets how different facts relate to each other.

What's the solution?

The researchers created HippoRAG 2, which builds on the earlier HippoRAG system to make AI memory more human-like. It uses an algorithm called Personalized PageRank over a knowledge graph and combines it with deeper integration of text passages and a language model that filters retrieved information at query time. This helps the AI connect ideas better and remember both facts and how they relate to each other. It's like giving that smart friend a better way to organize their thoughts and connect the dots between different pieces of information.
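To give a feel for the core algorithm, here is a minimal sketch of Personalized PageRank via power iteration. This is not the paper's implementation: the toy graph, seed choice, and parameter values are illustrative assumptions. The idea is that a random walker mostly follows graph edges but keeps "teleporting" back to seed nodes related to the query, so the final scores rank nodes by their relevance to those seeds.

```python
def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Score nodes of a directed graph by relevance to the seed set.

    graph: dict mapping each node to a list of its out-neighbors.
    seeds: nodes the walk restarts from (e.g. entities found in a query).
    alpha: probability of following an edge instead of restarting.
    """
    nodes = list(graph)
    # Restart distribution: all teleport mass goes to the seed nodes.
    p = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    r = dict(p)  # current score estimate
    for _ in range(iters):
        nxt = {n: (1 - alpha) * p[n] for n in nodes}  # restart term
        for n in nodes:
            out = graph[n]
            if not out:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    nxt[s] += alpha * r[n] / len(seeds)
            else:
                share = alpha * r[n] / len(out)
                for m in out:
                    nxt[m] += share
        r = nxt
    return r

# Toy knowledge graph: edges point from one fact/entity node to another.
graph = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
scores = personalized_pagerank(graph, seeds={"A"})
```

With seed `"A"`, nodes reachable from `A` (here `B` and `C`) pick up score through the walk, while the unreachable `D` stays near zero; this is the mechanism that lets the retriever surface facts connected to the query through multiple hops, not just facts that match it directly.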

Why it matters?

This matters because it brings AI memory closer to how humans learn. HippoRAG 2 is 7% better at associative tasks (connecting related ideas) than the best current embedding-based retriever, while also doing well at recalling facts and at sense-making, that is, understanding how pieces of information fit together. This could lead to AI that learns and adapts more naturally, which is important for tasks that require understanding complex information or keeping up with changing knowledge in fields like science or medicine.

Abstract

Our ability to continuously acquire, organize, and leverage knowledge is a key feature of human intelligence that AI systems must approximate to unlock their full potential. Given the challenges in continual learning with large language models (LLMs), retrieval-augmented generation (RAG) has become the dominant way to introduce new information. However, its reliance on vector retrieval hinders its ability to mimic the dynamic and interconnected nature of human long-term memory. Recent RAG approaches augment vector embeddings with various structures like knowledge graphs to address some of these gaps, namely sense-making and associativity. However, their performance on more basic factual memory tasks drops considerably below standard RAG. We address this unintended deterioration and propose HippoRAG 2, a framework that outperforms standard RAG comprehensively on factual, sense-making, and associative memory tasks. HippoRAG 2 builds upon the Personalized PageRank algorithm used in HippoRAG and enhances it with deeper passage integration and more effective online use of an LLM. This combination pushes this RAG system closer to the effectiveness of human long-term memory, achieving a 7% improvement in associative memory tasks over the state-of-the-art embedding model while also exhibiting superior factual knowledge and sense-making memory capabilities. This work paves the way for non-parametric continual learning for LLMs. Our code and data will be released at https://github.com/OSU-NLP-Group/HippoRAG.