Experience-Guided Adaptation of Inference-Time Reasoning Strategies
Adam Stein, Matthew Trager, Benjamin Bowman, Michael Kleinman, Aditya Chattopadhyay, Wei Xia, Stefano Soatto
2025-11-17
Summary
This paper introduces a new AI system called Experience-Guided Reasoner, or EGuR, that can learn and improve how it solves problems *while* it's being used, without needing to be retrained from scratch.
What's the problem?
Current AI systems that can learn from experience are limited. Some can only adapt by modifying the text fed to the underlying model, which means they can't change fundamental settings like sampling parameters (how creative the answers are), which tools are available, or the overall approach to a problem. Other systems that *can* make bigger changes require offline optimization ahead of time and then stay static once deployed. In short, it's hard to build an AI that is both flexible and efficient at learning on the fly.
What's the solution?
EGuR solves this by creating 'strategies' – essentially step-by-step plans for solving problems – dynamically. It uses another AI, a 'meta-strategy,' to generate these plans, which include everything from the wording of prompts to the tools used and how the AI decides what to do next. A 'Guide' creates different strategy options based on past experiences, and a 'Consolidator' learns from the results of using those strategies to make even better ones in the future. These strategies are then saved and reused, saving time and computing power.
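The loop described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the class names (`Strategy`, `Guide`, `Consolidator`), the cache keyed by problem, and the stand-in `propose` logic (which merely varies the sampling temperature instead of calling an LLM meta-strategy) are all assumptions made for clarity.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Strategy:
    """A complete computational procedure: prompt, sampling params, tools, control logic."""
    prompt: str
    temperature: float
    tools: tuple = ()

@dataclass
class Memory:
    """Structured record of past (problem, strategy, outcome) experiences."""
    experiences: list = field(default_factory=list)

class Guide:
    """Generates candidate strategies conditioned on the problem and past experience."""
    def propose(self, problem, memory, n=2):
        # In the real system an LLM-based meta-strategy writes these plans;
        # here we just vary the sampling temperature as a stand-in.
        return [Strategy(prompt=f"Solve: {problem}", temperature=0.2 * i)
                for i in range(1, n + 1)]

class Consolidator:
    """Integrates execution feedback so future strategy generation improves."""
    def update(self, memory, problem, strategy, outcome):
        memory.experiences.append((problem, strategy, outcome))

def solve(problem, guide, consolidator, memory, cache, execute):
    if problem in cache:                      # reuse a previously successful strategy
        return execute(cache[problem])
    outcome = None
    for strategy in guide.propose(problem, memory):
        outcome = execute(strategy)           # run the full strategy end to end
        consolidator.update(memory, problem, strategy, outcome)
        if outcome["correct"]:
            cache[problem] = strategy         # save for cheap reuse later
            break
    return outcome
```

The caching step is what saves compute: once a strategy succeeds on a problem, later encounters skip strategy generation entirely and go straight to execution.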
Why does it matter?
This is important because it allows AI systems to become much more adaptable and efficient. EGuR demonstrated significant improvements in accuracy and speed on several challenging tasks, and these improvements continued as the system gained more experience. This means we can build AI that gets better at solving problems over time, without constant human intervention or expensive retraining, opening the door to more powerful and practical AI applications.
Abstract
Enabling agentic AI systems to adapt their problem-solving approaches based on post-training interactions remains a fundamental challenge. While systems that update and maintain a memory at inference time have been proposed, existing designs only steer the system by modifying textual input to a language model or agent, which means that they cannot change sampling parameters, remove tools, modify system prompts, or switch between agentic and workflow paradigms. On the other hand, systems that adapt more flexibly require offline optimization and remain static once deployed. We present Experience-Guided Reasoner (EGuR), which generates tailored strategies -- complete computational procedures involving LLM calls, tools, sampling parameters, and control logic -- dynamically at inference time based on accumulated experience. We achieve this using an LLM-based meta-strategy -- a strategy that outputs strategies -- enabling adaptation of all strategy components (prompts, sampling parameters, tool configurations, and control logic). EGuR operates through two components: a Guide generates multiple candidate strategies conditioned on the current problem and structured memory of past experiences, while a Consolidator integrates execution feedback to improve future strategy generation. This produces complete, ready-to-run strategies optimized for each problem, which can be cached, retrieved, and executed as needed without wasting resources. Across five challenging benchmarks (AIME 2025, 3-SAT, and three Big Bench Extra Hard tasks), EGuR achieves up to 14% accuracy improvements over the strongest baselines while reducing computational costs by up to 111x, with both metrics improving as the system gains experience.