Reinforcement Learning Improves Traversal of Hierarchical Knowledge in LLMs

Renfei Zhang, Manasa Kaniselvan, Niloofar Mireshghallah

2025-11-11

Summary

This research challenges the common belief that improving language models with reinforcement learning (RL) makes them better at reasoning but worse at remembering facts. The paper finds that RL often *improves* a model's ability to recall specific knowledge, especially when that knowledge is organized hierarchically, as medical codes are.

What's the problem?

A common assumption has been that using reinforcement learning to make language models better at complex tasks causes them to forget things they already knew. Specifically, RL-enhanced models were expected to perform worse on tasks that simply require recalling memorized information. However, the researchers noticed that RL-enhanced models were surprisingly good at recalling detailed, structured knowledge, and set out to understand why.

What's the solution?

The researchers investigated this by testing how well models recalled medical codes and related information. They found that RL didn't necessarily *teach* the models new facts, but instead improved their ability to *navigate* the existing knowledge already stored within the model. They showed that giving regular models (trained with supervised learning) specific instructions on how to search for information – essentially, 'walking' them through the knowledge structure – could close much of the performance gap with the RL models. Further analysis of the model's internal workings revealed that RL primarily changes *how* the model searches for information, not the information itself.
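The 'walking' idea can be made concrete with a small sketch. The prompt-building code below is illustrative only: the step wording, the level names (chapter, category, subcategory), and the helper names are assumptions for this example, not the paper's actual prompts.

```python
# Hypothetical sketch of "structured prompting" for hierarchical code lookup.
# Instead of asking for the answer in one shot, the prompt walks the model
# through the knowledge hierarchy one level at a time. All level names and
# step wording here are illustrative, not taken from the paper.

def direct_prompt(code: str) -> str:
    """Baseline: ask for the meaning of the code in a single step."""
    return f"What does medical code {code} refer to?"

def structured_prompt(code: str) -> str:
    """Guide the model through the hierarchy level by level."""
    steps = [
        f"Step 1: Which chapter of the coding system contains code {code}?",
        f"Step 2: Within that chapter, which category does {code} fall under?",
        f"Step 3: Given that category, what concept does code {code} refer to?",
    ]
    return "\n".join(steps)

print(structured_prompt("57.95"))
```

The point of the sketch is the contrast: the structured version forces the same intermediate traversal steps that, per the paper, RL-enhanced models appear to take on their own.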

Why it matters?

This is important because it changes how we think about reinforcement learning for language models. It suggests that RL isn't just about adding new knowledge, but about making models more efficient at using the knowledge they already have. This could lead to better ways of training language models, especially for tasks that require accessing and applying large amounts of structured information, like in healthcare or law.

Abstract

Reinforcement learning (RL) is often credited with improving language model reasoning and generalization at the expense of degrading memorized knowledge. We challenge this narrative by observing that RL-enhanced models consistently outperform their base and supervised fine-tuned (SFT) counterparts on pure knowledge recall tasks, particularly those requiring traversal of hierarchical, structured knowledge (e.g., medical codes). We hypothesize these gains stem not from newly acquired data, but from improved procedural skills in navigating and searching existing knowledge hierarchies within the model parameters. To support this hypothesis, we show that structured prompting, which explicitly guides SFTed models through hierarchical traversal, recovers most of the performance gap (reducing 24pp to 7pp on MedConceptsQA for DeepSeek-V3/R1). We further find that while prompting improves final-answer accuracy, RL-enhanced models retain superior ability to recall correct procedural paths on deep-retrieval tasks. Finally, our layer-wise internal activation analysis reveals that while factual representations (e.g., activations for the statement "code 57.95 refers to urinary infection") maintain high cosine similarity between SFT and RL models, query representations (e.g., "what is code 57.95") diverge noticeably, indicating that RL primarily transforms how models traverse knowledge rather than the knowledge representation itself.
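The activation comparison in the abstract reduces to computing cosine similarity between corresponding layer activations of the SFT and RL models. A minimal sketch, using toy vectors in place of real model activations (the vectors here are invented for illustration):

```python
import math

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors: u.v / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for layer activations (not real model data):
# factual statements yield nearly identical activations across SFT and RL...
fact_sft = [1.0, 0.5, -0.2, 0.8]
fact_rl  = [1.0, 0.5, -0.2, 0.8]
# ...while query-style inputs produce diverging activations.
query_sft = [1.0, 0.5, -0.2, 0.8]
query_rl  = [0.3, 0.9,  0.6, 0.1]

fact_sim = cosine_similarity(fact_sft, fact_rl)
query_sim = cosine_similarity(query_sft, query_rl)
print(f"fact similarity:  {fact_sim:.3f}")
print(f"query similarity: {query_sim:.3f}")
```

With this kind of measurement, high similarity on factual statements alongside lower similarity on queries is the signature the paper reports: the stored knowledge looks unchanged while the retrieval pathway shifts.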