Provable Benefits of In-Tool Learning for Large Language Models
Sam Houliston, Ambroise Odonnat, Charles Arnal, Vivien Cabannes
2025-08-29
Summary
This paper investigates why giving AI models access to tools, like search engines, is better than just trying to make them memorize everything. It shows that tools allow AI to 'know' a practically unlimited amount of information, while there's a hard limit to how much an AI can store directly in its own weights.
What's the problem?
AI models are getting really good, but they still struggle with remembering facts. There are two main ways to get them to 'know' things: you can change the model's internal settings (like its weights) to store facts directly, or you can give it tools to look up information when it needs it. The problem is, we didn't really understand *why* one approach might be better than the other, especially in terms of how much information each method could handle.
What's the solution?
The researchers proved mathematically that a model's ability to memorize facts is limited by its size: the more facts you want it to remember, the bigger the model needs to be. However, they also showed that if you give a model tools to access information, it can in principle recall an unlimited number of facts. They then ran experiments where AI models using tools consistently outperformed models that were just trying to memorize things, and found that teaching models *how* to use tools is more effective than just feeding them facts to memorize.
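The contrast above can be illustrated with a toy sketch (this is a hypothetical illustration, not the paper's actual circuit construction or proof): an "in-weight" learner has a fixed number of storage slots standing in for parameters, so old facts get overwritten once capacity is exceeded, while an "in-tool" learner stores nothing internally and instead applies one fixed rule, query the external store, so its recall scales with the size of the database rather than the model.

```python
# Toy illustration (not the paper's construction): in-weight recall is
# capped by "parameter" count, while in-tool recall delegates to an
# external store that can grow without bound.

class InWeightModel:
    """Stores facts directly in a fixed number of parameter-like slots."""
    def __init__(self, n_params):
        self.capacity = n_params  # capacity grows only with model size
        self.weights = {}

    def learn(self, key, value):
        if len(self.weights) >= self.capacity:
            # Evict the oldest fact: memorization is bounded by capacity.
            self.weights.pop(next(iter(self.weights)))
        self.weights[key] = value

    def recall(self, key):
        return self.weights.get(key)  # None if the fact was never kept


class InToolModel:
    """Learns only the *rule* "query the external tool", not the facts."""
    def __init__(self, tool):
        self.tool = tool  # external database / search engine

    def recall(self, key):
        return self.tool.get(key)  # one fixed circuit, unbounded facts


# An external knowledge base with far more facts than the small model holds.
database = {f"fact_{i}": i for i in range(1000)}

small = InWeightModel(n_params=10)
for k, v in database.items():
    small.learn(k, v)

tool_user = InToolModel(database)

print(small.recall("fact_0"))      # None: evicted, capacity exceeded
print(tool_user.recall("fact_0"))  # 0: retrieved via the tool
```

The point of the sketch mirrors the paper's claim: growing the database costs the tool-using model nothing, whereas the memorizing model must grow its parameter count to keep up.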
Why it matters?
This work is important because it explains why tool-using AI is so powerful and why it's likely to become even more so. It's not just a practical improvement, but a fundamental one. It means that instead of building ever-larger AI models to store more information, we can focus on building smarter tools and teaching AI how to use them effectively, which is a much more scalable approach.
Abstract
Tool-augmented language models, equipped with retrieval, memory, or external APIs, are reshaping AI, yet their theoretical advantages remain underexplored. In this paper, we address this gap by demonstrating the benefits of in-tool learning (external retrieval) over in-weight learning (memorization) for factual recall. We show that the number of facts a model can memorize solely in its weights is fundamentally limited by its parameter count. In contrast, we prove that tool-use enables unbounded factual recall via a simple and efficient circuit construction. These results are validated in controlled experiments, where tool-using models consistently outperform memorizing ones. We further show that for pretrained large language models, teaching tool-use and general rules is more effective than finetuning facts into memory. Our work provides both a theoretical and empirical foundation, establishing why tool-augmented workflows are not just practical, but provably more scalable.