SkillOrchestra: Learning to Route Agents via Skill Transfer

Jiayu Wang, Yifei Ming, Zixuan Ke, Shafiq Joty, Aws Albarghouthi, Frederic Sala

2026-02-24

Summary

This paper introduces SkillOrchestra, a framework for orchestrating multiple AI models (agents) so that each part of a task is handled by the model best suited to it, with the goal of getting more out of the combined system than any single model delivers on its own.

What's the problem?

Systems that route tasks among different AI models currently struggle with two main issues. First, many routers pick a model once for the entire request, without considering how the task's requirements change as it progresses. Second, orchestrators trained with reinforcement learning (RL) are expensive to adapt and often suffer from "routing collapse": in multi-turn settings they keep invoking one strong but costly model instead of using the other options.

What's the solution?

SkillOrchestra takes a different approach. Instead of learning a complete routing policy end-to-end, it learns fine-grained 'skills' from execution experience, then models how competent each agent is at each skill and what it costs to use that agent for it. At deployment, the orchestrator infers which skills the current interaction demands and selects the agents that best satisfy them under an explicit performance-cost trade-off.
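The selection step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `AgentProfile` class, the linear scoring rule, the `cost_weight` parameter, and the example numbers are all assumptions chosen to show how per-skill competence and cost estimates could drive a performance-cost trade-off.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    """Hypothetical per-agent profile; field names are illustrative."""
    name: str
    competence: dict  # skill -> estimated success rate in [0, 1]
    cost: dict        # skill -> estimated cost per call (arbitrary units)


def score(agent, skill_demands, cost_weight):
    """Demand-weighted competence minus a cost penalty (assumed scoring rule)."""
    total = sum(skill_demands.values())
    if total <= 0:
        raise ValueError("skill_demands must contain positive weights")
    utility = 0.0
    for skill, weight in skill_demands.items():
        share = weight / total
        competence = agent.competence.get(skill, 0.0)  # unknown skill: no credit
        cost = agent.cost.get(skill, 0.0)
        utility += share * (competence - cost_weight * cost)
    return utility


def select_agent(agents, skill_demands, cost_weight=0.3):
    """Pick the agent with the best competence-cost score for the inferred skills."""
    return max(agents, key=lambda a: score(a, skill_demands, cost_weight))


# Made-up numbers: a strong, expensive agent and a weaker, cheap one.
agents = [
    AgentProfile("large-model", {"math": 0.9, "retrieval": 0.8},
                 {"math": 1.0, "retrieval": 1.0}),
    AgentProfile("small-model", {"math": 0.5, "retrieval": 0.75},
                 {"math": 0.1, "retrieval": 0.1}),
]

math_choice = select_agent(agents, {"math": 1.0})
retrieval_choice = select_agent(agents, {"retrieval": 1.0})
```

With these illustrative numbers the expensive agent is chosen only where its competence advantage outweighs its cost (the math step), while the cheap agent handles retrieval. Because the decision is made per step rather than per request, the router need not invoke the same agent throughout a multi-turn task.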

Why it matters?

This research offers a way to build more effective and efficient compound AI systems. Across ten benchmarks, SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% while cutting learning cost by 700x relative to Router-R1 and 300x relative to ToolOrchestra. Because routing decisions are expressed in terms of explicit skills, the approach is also easier to interpret and more sample-efficient than RL-based alternatives.

Abstract

Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms SoTA RL-based orchestrators by up to 22.5% with 700x and 300x learning cost reduction compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
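As a rough illustration of what "models agent-specific competence and cost" from execution experience could look like, the sketch below aggregates logged outcomes into per-agent, per-skill estimates. The record format, the function name `estimate_profiles`, and the smoothed success-rate estimator are assumptions made for this example; the abstract does not specify how the estimates are computed.

```python
from collections import defaultdict


def estimate_profiles(records, prior_successes=1.0, prior_trials=2.0):
    """Aggregate execution logs into competence and cost estimates.

    records: iterable of (agent, skill, succeeded, cost) tuples (assumed format).
    Competence is a smoothed success rate, so agent-skill pairs with few
    observations are pulled toward 0.5 instead of jumping to 0 or 1.
    """
    # (agent, skill) -> [successes, trials, total_cost]
    stats = defaultdict(lambda: [0, 0, 0.0])
    for agent, skill, succeeded, cost in records:
        entry = stats[(agent, skill)]
        entry[0] += int(succeeded)
        entry[1] += 1
        entry[2] += cost

    profiles = {}
    for key, (successes, trials, total_cost) in stats.items():
        profiles[key] = {
            "competence": (successes + prior_successes) / (trials + prior_trials),
            "cost": total_cost / trials,  # mean observed cost per call
        }
    return profiles


# Made-up execution log for one agent on one skill.
log = [
    ("agent-a", "math", True, 2.0),
    ("agent-a", "math", True, 4.0),
    ("agent-a", "math", False, 3.0),
]
profiles = estimate_profiles(log)
```

Here `agent-a` succeeds on two of three math attempts, giving a smoothed competence of (2 + 1) / (3 + 2) = 0.6 and a mean cost of 3.0. Estimates of this kind need only counted outcomes rather than policy-gradient updates, which is one plausible reading of why explicit skill modeling can be sample-efficient.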