When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Soyeong Jeong, Taehee Jung, Sung Ju Hwang, Joo-Kyung Kim, Dongyeop Kang

2025-10-10

Summary

This paper introduces ToTAL, a new method for improving how long-context language models (LCLMs) use information from many sources to solve complex problems that require multiple steps of reasoning.

What's the problem?

Large language models are getting better at handling huge amounts of text at once, which is great for tasks needing lots of information. However, just giving them more documents doesn't automatically mean they'll understand *how* all the information connects together to reach a conclusion. They struggle to effectively combine evidence from different sources to perform multi-step reasoning.

What's the solution?

The researchers developed 'thought templates', which are essentially pre-built structures for reasoning. These templates show the model *how* to combine evidence, like a guide for connecting the dots. The templates are created by looking at how similar problems were solved before and are then refined using feedback written in plain language. This allows the model to learn better ways to use the information it's given.
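
To make this concrete, here is a minimal sketch (not the paper's actual code) of what prompting with a thought template could look like: a reusable reasoning structure is placed alongside the retrieved documents so the model follows the structure while grounding each step in the evidence. The template wording, the `build_prompt` helper, and the `call_lclm` placeholder are all illustrative assumptions.

```python
# Illustrative sketch of thought-template-augmented prompting.
# The template text, prompt layout, and model call are assumptions,
# not the implementation from the paper.

from typing import List

# A hypothetical thought template: a reusable reasoning structure that tells
# the model HOW to combine evidence, independent of any specific documents.
BRIDGE_ENTITY_TEMPLATE = """\
Reasoning structure (bridge-entity question):
1. From the documents, find the entity that links the question to the answer.
2. Locate the document that states the needed fact about that bridge entity.
3. Combine both facts into a single chain and state the final answer.
"""

def build_prompt(question: str, documents: List[str], template: str) -> str:
    """Place the reusable template before the factual documents so the model
    follows the reasoning structure while grounding each step in evidence."""
    doc_block = "\n\n".join(f"[Doc {i + 1}] {d}" for i, d in enumerate(documents))
    return f"{template}\nDocuments:\n{doc_block}\n\nQuestion: {question}\nAnswer:"

def call_lclm(prompt: str) -> str:
    """Placeholder for a long-context language model call (e.g., an API request)."""
    return "<model answer>"

if __name__ == "__main__":
    docs = [
        "The film Inception was directed by Christopher Nolan.",
        "Christopher Nolan was born in London in 1970.",
    ]
    question = "In which city was the director of Inception born?"
    prompt = build_prompt(question, docs, BRIDGE_ENTITY_TEMPLATE)
    print(prompt)
    print(call_lclm(prompt))
```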

Why it matters?

This work is important because it makes long-context language models much more effective at complex reasoning tasks. It not only improves the performance of existing models but also shows that these reasoning abilities can be transferred to smaller, more accessible models, making advanced reasoning more widely available and easier to understand.

Abstract

Recent Long-Context Language Models (LCLMs) can process hundreds of thousands of tokens in a single prompt, enabling new opportunities for knowledge-intensive multi-hop reasoning by integrating large sets of retrieved documents or, in some cases, all necessary information directly. However, simply feeding more documents into the context window fails to capture how evidence should be connected. We address this gap with thought templates, which recast reasoning as reusable thought caches, derived from prior problem-solving traces, structuring how evidence is combined and guiding multi-hop inference with factual documents. To keep these templates effective, we propose an update strategy that iteratively refines templates derived from training data through natural-language feedback. Across diverse benchmarks and LCLM families, our approach delivers consistent gains over strong baselines in both retrieval-based and retrieval-free settings. Furthermore, we show that optimized templates can be distilled into smaller open-source models, demonstrating their broad applicability and transparent reasoning reuse. We refer to our framework as Thought Template Augmented LCLMs (ToTAL).
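
For illustration only, the sketch below shows one way the template update strategy described in the abstract could be organized: answer the training questions with the current template, turn the failures into natural-language feedback, and rewrite the template accordingly. The function names (`solve`, `critique`, `rewrite`), the failure format, and the stopping rule are assumptions, not the authors' implementation.

```python
# Illustrative sketch of iterative template refinement via
# natural-language feedback; details are assumptions, not the paper's code.

from typing import Callable, List, Tuple

def refine_template(
    template: str,
    train_examples: List[Tuple[str, str]],        # (question, gold_answer) pairs
    solve: Callable[[str, str], str],             # answers a question given a template
    critique: Callable[[str, List[str]], str],    # turns failures into plain-language feedback
    rewrite: Callable[[str, str], str],           # rewrites the template from that feedback
    rounds: int = 3,
) -> str:
    """Iteratively improve a reusable thought template on training data."""
    for _ in range(rounds):
        # Collect training examples the current template fails on.
        failures = [
            f"Q: {q} | expected: {gold} | got: {pred}"
            for q, gold in train_examples
            if (pred := solve(template, q)) != gold
        ]
        if not failures:
            break  # template already handles all training examples
        feedback = critique(template, failures)   # natural-language feedback
        template = rewrite(template, feedback)    # updated, still reusable template
    return template
```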