Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding

Yuqing Li, Jiangnan Li, Zheng Lin, Ziyan Zhou, Junjie Wu, Weiping Wang, Jie Zhou, Mo Yu

2025-12-29

Summary

This paper introduces a new way to improve how computer systems understand and work with long pieces of text, aiming to make them more like how humans process information.

What's the problem?

Currently, computer systems that answer questions about long documents, known as Retrieval-Augmented Generation (RAG) systems, struggle because they lack a good overall understanding of the text's main ideas. They treat a document as a collection of separate pieces rather than a unified whole, which makes it hard to connect information and reason effectively. This is unlike humans, who build a 'mindscape', a global understanding, as they read.

What's the solution?

The researchers developed a system called MiA-RAG, which stands for Mindscape-Aware RAG. It works by first creating a summarized, hierarchical overview of the entire document, essentially building that 'mindscape'. Then, when the system needs to find information or answer a question, it uses this overview to guide its search and reasoning, helping it connect details to the bigger picture. This improves both finding relevant information and generating coherent answers.
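To make the two-stage idea concrete, here is a minimal sketch of that pipeline: hierarchically summarize the document into a 'mindscape', then use that global summary to enrich the query before retrieval. The `summarize` and `embed` functions below are hypothetical stand-ins (word-frequency summaries and bag-of-words vectors); the actual MiA-RAG system uses an LLM for summarization and a trained encoder for embeddings.

```python
# Sketch of a mindscape-aware retrieval pipeline (illustrative only).
from collections import Counter
import math

def summarize(text: str, max_words: int = 8) -> str:
    """Stand-in summarizer: keep the most frequent words."""
    words = [w.lower().strip(".,") for w in text.split()]
    return " ".join(w for w, _ in Counter(words).most_common(max_words))

def build_mindscape(chunks: list[str], fanout: int = 2) -> str:
    """Hierarchical summarization: summarize chunks, then merge
    summaries level by level until one global summary remains."""
    level = [summarize(c) for c in chunks]
    while len(level) > 1:
        level = [summarize(" ".join(level[i:i + fanout]))
                 for i in range(0, len(level), fanout)]
    return level[0]

def embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words counts."""
    return Counter(w.lower().strip(".,") for w in text.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], mindscape: str) -> str:
    """Mindscape-aware retrieval: enrich the query with the global
    summary so chunk scoring reflects the document's overall topic."""
    enriched = embed(query + " " + mindscape)
    return max(chunks, key=lambda c: cosine(enriched, embed(c)))
```

In a real system, the retrieved chunk and the mindscape would both be passed to the generator, so the answer is produced within the global context rather than from an isolated passage.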

Why it matters?

This research is important because it addresses a key limitation of current AI systems dealing with long texts. By enabling computers to understand the global context of a document, MiA-RAG can lead to more accurate, reliable, and human-like performance in tasks like answering complex questions, understanding research papers, or processing legal documents. It's a step towards AI that can truly comprehend information like we do.

Abstract

Humans understand long and complex texts by relying on a holistic semantic representation of the content. This global view helps organize prior knowledge, interpret new information, and integrate evidence dispersed across a document, as revealed by the Mindscape-Aware Capability of humans in psychology. Current Retrieval-Augmented Generation (RAG) systems lack such guidance and therefore struggle with long-context tasks. In this paper, we propose Mindscape-Aware RAG (MiA-RAG), the first approach that equips LLM-based RAG systems with explicit global context awareness. MiA-RAG builds a mindscape through hierarchical summarization and conditions both retrieval and generation on this global semantic representation. This enables the retriever to form enriched query embeddings and the generator to reason over retrieved evidence within a coherent global context. We evaluate MiA-RAG across diverse long-context and bilingual benchmarks for evidence-based understanding and global sense-making. It consistently surpasses baselines, and further analysis shows that it aligns local details with a coherent global representation, enabling more human-like long-context retrieval and reasoning.