Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning
Jinyang Wu, Guocheng Zhai, Ruihan Jin, Jiahao Yuan, Yuhao Shen, Shuai Zhang, Zhengqi Wen, Jianhua Tao
2026-01-08
Summary
This paper introduces a new system called ATLAS that helps AI agents decide which tools and large language models (LLMs) to combine in order to solve complex problems.
What's the problem?
As more and more powerful LLMs and tools for them to use become available, figuring out the *best* combination for a specific task becomes incredibly difficult. It's like having a huge toolbox and trying to find exactly the right tools for a job: there are simply too many options. Current methods often stick with a single model or a pre-set way of calling tools, which fails to take advantage of the fact that different model-tool combinations work better for different situations.
What's the solution?
ATLAS tackles this problem with a two-part approach. First, it groups similar tools and LLMs by what they're good at, using past performance data and no extra training. This is like organizing your toolbox into sections. Second, it uses reinforcement learning to let the AI agent experiment and learn the best sequence of tools and models for tasks it hasn't seen before. It essentially teaches itself how to combine them effectively.
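The first part, training-free grouping and routing, can be pictured with a toy sketch. Everything here is a simplifying assumption for illustration: the model-tool names, the domains, and the prior scores are made up, and the real system's grouping is more sophisticated than picking each pair's strongest domain.

```python
from collections import defaultdict

# Hypothetical empirical priors: how well each (model, tool) pair scores
# per domain. These numbers are illustrative, not from the paper.
PRIORS = {
    ("gpt-4o", "calculator"): {"math": 0.92, "vision": 0.40},
    ("llava-1.5", "ocr"): {"math": 0.35, "vision": 0.88},
    ("qwen-vl", "detector"): {"math": 0.30, "vision": 0.83},
    ("llama-3", "python-exec"): {"math": 0.85, "vision": 0.25},
}

def build_clusters(priors):
    """Group model-tool pairs by the domain they score highest on.

    No training is needed: the grouping reads off recorded scores.
    """
    clusters = defaultdict(list)
    for pair, scores in priors.items():
        best_domain = max(scores, key=scores.get)
        clusters[best_domain].append(pair)
    return clusters

def route(query_domain, priors):
    """Send a query to the strongest pair inside its domain's cluster."""
    clusters = build_clusters(priors)
    candidates = clusters.get(query_domain, list(priors))
    return max(candidates, key=lambda p: priors[p].get(query_domain, 0.0))
```

With these toy priors, a math query routes to `("gpt-4o", "calculator")` and a vision query to `("llava-1.5", "ocr")`, showing how precomputed priors alone can steer model-tool selection.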
Why it matters?
This research is important because ATLAS significantly outperforms even very advanced AI systems like GPT-4o on a variety of challenging tasks. It's especially good at tasks that require understanding both text and images, showing that it can effectively coordinate specialized tools. This means we can build AI agents that are more adaptable, accurate, and capable of solving real-world problems.
Abstract
The integration of large language models (LLMs) with external tools has significantly expanded the capabilities of AI agents. However, as the diversity of both LLMs and tools increases, selecting the optimal model-tool combination becomes a high-dimensional optimization challenge. Existing approaches often rely on a single model or fixed tool-calling logic, failing to exploit the performance variations across heterogeneous model-tool pairs. In this paper, we present ATLAS (Adaptive Tool-LLM Alignment and Synergistic Invocation), a dual-path framework for dynamic tool usage in cross-domain complex reasoning: (1) training-free cluster-based routing that exploits empirical priors for domain-specific alignment, and (2) RL-based multi-step routing that explores autonomous trajectories for out-of-distribution generalization. Extensive experiments across 15 benchmarks demonstrate that our method outperforms closed-source models like GPT-4o, surpassing existing routing methods on both in-distribution (+10.1%) and out-of-distribution (+13.1%) tasks. Furthermore, our framework shows significant gains in visual reasoning by orchestrating specialized multi-modal tools.
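The second path, RL-based multi-step routing, can be sketched as tabular Q-learning over a toy two-step trajectory. The action names, the hidden "optimal" trajectory, and the reward of 1 per correct step are illustrative assumptions, not the paper's actual reward design or training setup.

```python
import random

random.seed(0)  # make the toy run reproducible

# Each action is a hypothetical (model + tool) choice at one step.
ACTIONS = ["text-llm+search", "vision-llm+ocr", "math-llm+calc"]
BEST = {0: "vision-llm+ocr", 1: "math-llm+calc"}  # hidden optimal trajectory
N_STEPS = 2

# Tabular Q-values: one entry per (step, action).
Q = {(s, a): 0.0 for s in range(N_STEPS) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration

def step_reward(s, a):
    """Toy reward: 1 if the agent picked the right pair at this step."""
    return 1.0 if BEST[s] == a else 0.0

for episode in range(500):
    for s in range(N_STEPS):
        # Epsilon-greedy: mostly exploit, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        r = step_reward(s, a)
        nxt = max(Q[(s + 1, x)] for x in ACTIONS) if s + 1 < N_STEPS else 0.0
        Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])

# Greedy policy after training: the learned routing trajectory.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STEPS)]
```

After training, the greedy policy recovers the hidden optimal trajectory, illustrating how trial-and-error exploration can discover effective multi-step model-tool sequences without hand-coded routing rules.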