Learning to Configure Agentic AI Systems

Aditya Taparia, Som Sagar, Ransalu Senanayake

2026-02-17

Summary

This paper introduces a method for automatically configuring agentic AI systems, which are programs that use large language models (LLMs) to solve complex tasks. Instead of relying on a single preset configuration, the system learns to adjust its settings for each individual query.

What's the problem?

Currently, setting up these AI agents is difficult because there are many choices to make: what order to perform tasks in, which tools to use, how large a token budget to allow, and how to phrase instructions. People usually rely on fixed templates or manually tuned rules, but these don't work well across all situations. They can be inefficient, wasting compute on simple tasks, and unreliable when faced with challenging problems.
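
To get a feel for how combinatorial this design space is, here is a small illustrative sketch. None of these option names come from the paper; they are hypothetical examples of the kinds of axes involved.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class AgentConfig:
    workflow: str              # e.g. answer directly vs. plan then act
    tools: tuple[str, ...]     # which tools the agent may call
    token_budget: int          # how many tokens it may spend
    prompt_style: str          # how instructions are phrased

# Hypothetical option sets for each axis (not taken from the paper).
WORKFLOWS = ["direct", "plan-then-act", "reflect-and-retry"]
TOOL_SUBSETS = [(), ("search",), ("calculator",), ("search", "calculator")]
TOKEN_BUDGETS = [512, 2048, 8192]
PROMPT_STYLES = ["terse", "step-by-step"]

configs = [AgentConfig(*combo) for combo in
           product(WORKFLOWS, TOOL_SUBSETS, TOKEN_BUDGETS, PROMPT_STYLES)]
print(len(configs))  # 3 * 4 * 3 * 2 = 72 configurations in even this toy space
```

A fixed template picks one of these configurations for every query; the paper's point is that the best choice depends on the query itself.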

What's the solution?

The researchers developed a system called ARC (Agentic Resource & Configuration learner), which uses reinforcement learning, a type of machine learning where an agent learns by trial and error, to dynamically adjust the agent's configuration for each new question or task. ARC learns a 'policy' that decides the best settings based on the specific input it receives, essentially tailoring the agent's approach to the problem at hand.
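
To make that concrete, here is a minimal sketch of a per-query configuration policy trained with a REINFORCE-style update. This is not the paper's implementation: the paper learns a hierarchical policy, while this sketch uses a flat linear-softmax policy over a small discrete configuration set, and every name here (featurize, run_agent, the cost weight) is a hypothetical stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_FEATURES = 4   # hypothetical query-feature dimension
NUM_CONFIGS = 72   # size of a discrete configuration set (see sketch above)

# Flat linear-softmax policy over configurations (a simplification of the
# paper's hierarchical policy).
W = np.zeros((NUM_FEATURES, NUM_CONFIGS))

def featurize(query: str) -> np.ndarray:
    """Hypothetical stand-in for a real query encoder (e.g. an embedding)."""
    return np.array([len(query) / 100.0,
                     query.count(" ") / 20.0,
                     query.count("?"),
                     1.0])  # constant bias term

def select_config(features: np.ndarray) -> tuple[int, np.ndarray]:
    """Sample a configuration index from softmax(features @ W)."""
    logits = features @ W
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(NUM_CONFIGS, p=probs)), probs

def run_agent(query: str, config_id: int) -> tuple[bool, int]:
    """Hypothetical: run the agent under this configuration and report
    (task_success, tokens_used). Stubbed with random outcomes here."""
    return bool(rng.random() < 0.5), int(rng.integers(100, 5000))

def reinforce_update(features, action, probs, reward, lr=0.1):
    """REINFORCE: W += lr * reward * grad log pi(action | features)."""
    onehot = np.zeros(NUM_CONFIGS)
    onehot[action] = 1.0
    # For a linear-softmax policy, grad log pi = outer(features, onehot - probs).
    W[:] += lr * reward * np.outer(features, onehot - probs)

# Training loop: the reward trades task success off against token cost,
# with a hypothetical trade-off weight of 1e-4 per token.
for query in ["What is 2 + 2?", "Plan a multi-step research task."]:
    feats = featurize(query)
    action, probs = select_config(feats)
    success, tokens = run_agent(query, action)
    reward = float(success) - 1e-4 * tokens
    reinforce_update(feats, action, probs, reward)
```

The design choice this sketch mirrors is the reward: success is penalized by compute spent, which pushes the policy toward cheap configurations on easy queries and expensive ones only where they pay off.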

Why does it matter?

This is important because it shows that AI agents can be made more effective and efficient by letting them adapt to the task instead of forcing a one-size-fits-all approach. ARC consistently outperformed existing methods, achieving up to 25% higher task accuracy while reducing token and runtime costs, which could translate into significant cost savings and better performance in real-world applications.

Abstract

Configuring LLM-based agent systems involves choosing workflows, tools, token budgets, and prompts from a large combinatorial design space, and is typically handled today by fixed large templates or hand-tuned heuristics. This leads to brittle behavior and unnecessary compute, since the same cumbersome configuration is often applied to both easy and hard input queries. We formulate agent configuration as a query-wise decision problem and introduce ARC (Agentic Resource & Configuration learner), which learns a light-weight hierarchical policy using reinforcement learning to dynamically tailor these configurations. Across multiple benchmarks spanning reasoning and tool-augmented question answering, the learned policy consistently outperforms strong hand-designed and other baselines, achieving up to 25% higher task accuracy while also reducing token and runtime costs. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one size fits all" designs.
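
Read as a decision problem, the abstract's formulation can be sketched roughly as follows (the notation is mine, not the paper's): a policy pi_theta maps each query q to a configuration c from the design space, and is trained to maximize expected task reward minus a compute penalty.

```latex
\max_{\theta}\;
\mathbb{E}_{q \sim \mathcal{D}}\,
\mathbb{E}_{c \sim \pi_{\theta}(\cdot \mid q)}
\bigl[\, R(q, c) - \lambda \,\mathrm{Cost}(q, c) \,\bigr]
```

Here R(q, c) stands for task success, Cost(q, c) for token and runtime usage, and lambda is a hypothetical weight balancing the two; the reported accuracy gains alongside cost reductions suggest the learned policy finds better points on this trade-off than any fixed configuration does.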