Multi-Step Knowledge Interaction Analysis via Rank-2 Subspace Disentanglement

Sekh Mainul Islam, Pepa Atanasova, Isabelle Augenstein

2025-11-04

Summary

This paper investigates how large language models (LLMs) explain their reasoning, and specifically where that reasoning comes from: knowledge acquired during training (parametric knowledge) and information supplied in the prompt (context knowledge).

What's the problem?

Currently, it’s hard to tell if an LLM’s explanation is actually based on the information it was given, or if it’s just making things up. Previous research only looked at the very last step of an explanation, and treated the relationship between learned knowledge and provided information as a simple on/off switch. This doesn’t capture the more complex ways these two types of knowledge can work together, like one supporting or building on the other.

What's the solution?

The researchers developed a new method to analyze how learned knowledge and provided information interact within an LLM’s explanation. They used a mathematical technique called a ‘rank-2 projection subspace’, which represents each of the two knowledge sources with its own direction rather than collapsing them onto a single axis, giving a more detailed picture of how each source contributes to every step of the explanation. They then applied this method to explanations generated by three open-weight instruction-tuned LLMs on four question-answering datasets.
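To make the rank-1 vs. rank-2 distinction concrete, here is a minimal sketch of the underlying linear algebra. It assumes you already have a hidden-state vector `h` and two direction vectors `v_pk` and `v_ck` (e.g., extracted from model activations); the function names and the least-squares formulation are illustrative, not the paper's exact implementation. A rank-1 projection yields a single scalar (a binary PK-vs-CK choice), while a rank-2 projection recovers a separate coefficient for each knowledge source.

```python
import numpy as np

def rank2_disentangle(h, v_pk, v_ck):
    """Project hidden state h onto the rank-2 subspace spanned by the
    PK and CK directions; returns (alpha_pk, alpha_ck) coefficients."""
    B = np.stack([v_pk, v_ck], axis=1)          # (d, 2) basis matrix
    coeffs, *_ = np.linalg.lstsq(B, h, rcond=None)
    return coeffs                               # [alpha_pk, alpha_ck]

def rank1_disentangle(h, v):
    """Rank-1 baseline: a single scalar projection onto one direction,
    which can only express an either/or trade-off."""
    return float(h @ v) / float(v @ v)

# Toy example: a hidden state that genuinely mixes both sources.
d = 4
v_pk = np.array([1.0, 0.0, 0.0, 0.0])
v_ck = np.array([0.0, 1.0, 0.0, 0.0])
h = 0.7 * v_pk + 0.3 * v_ck

alpha_pk, alpha_ck = rank2_disentangle(h, v_pk, v_ck)
print(alpha_pk, alpha_ck)   # rank-2 recovers both contributions
print(rank1_disentangle(h, v_pk - v_ck))  # rank-1 collapses them to one number
```

The toy example shows why this matters: when an explanation step draws on both sources at once (here 0.7 parametric, 0.3 contextual), the rank-2 projection recovers both weights, whereas the rank-1 score can only say which side dominates.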

Why it matters?

This work is important because it provides a way to systematically check the accuracy and reliability of LLM explanations. By understanding how models use both their pre-existing knowledge and the information they’re given, we can better identify when they are ‘hallucinating’ (making things up) and build more trustworthy AI systems. It also shows how prompting techniques, like asking the model to ‘think step-by-step’, can influence whether the explanation relies more on provided information or learned knowledge.

Abstract

Natural Language Explanations (NLEs) describe how Large Language Models (LLMs) make decisions, drawing on both external Context Knowledge (CK) and Parametric Knowledge (PK) stored in model weights. Understanding their interaction is key to assessing the grounding of NLEs, yet it remains underexplored. Prior work has largely examined only single-step generation, typically the final answer, and has modelled PK and CK interaction only as a binary choice in a rank-1 subspace. This overlooks richer forms of interaction, such as complementary or supportive knowledge. We propose a novel rank-2 projection subspace that disentangles PK and CK contributions more accurately and use it for the first multi-step analysis of knowledge interactions across longer NLE sequences. Experiments on four QA datasets and three open-weight instruction-tuned LLMs show that diverse knowledge interactions are poorly represented in a rank-1 subspace but are effectively captured in our rank-2 formulation. Our multi-step analysis reveals that hallucinated NLEs align strongly with the PK direction, context-faithful ones balance PK and CK, and Chain-of-Thought prompting for NLEs shifts generated NLEs toward CK by reducing PK reliance. This work provides the first framework for systematic studies of multi-step knowledge interactions in LLMs through a richer rank-2 subspace disentanglement. Code and data: https://github.com/copenlu/pk-ck-knowledge-disentanglement.