
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

MiroMind Team, Song Bai, Lidong Bing, Carson Chen, Guanzheng Chen, Yuntao Chen, Zhe Chen, Ziyi Chen, Jifeng Dai, Xuan Dong, Yue Deng, Yunjie Fu, Junqi Ge, Chenxia Han, Tammy Huang, Zhenhang Huang, Jerry Jiao, Shilei Jiang, Tianyu Jiao, Xiaoqi Jian, Lei Lei, Ruilin Li

2025-11-18

Summary

This paper introduces MiroThinker, a new open-source AI research agent designed to work as a helpful research assistant. It's built to be really good at using tools and finding information to solve complex problems.

What's the problem?

Existing AI research assistants often try to improve by simply making the underlying AI model bigger or giving it more memory to work with. However, this approach has limitations. Just increasing size doesn't always lead to better reasoning, and longer chains of thought can sometimes lead to more errors. The problem is that these assistants don't effectively *learn* from their interactions with the environment and the information they find.

What's the solution?

The creators of MiroThinker took a different approach. Instead of focusing only on model size, they focused on how the AI interacts with its environment, through actions like searching the internet or using other tools. They used reinforcement learning to train the AI to have more frequent and deeper interactions, so it learns to ask for help, check its work, and refine its approach as it goes. Within a 256K context window, MiroThinker can make up to 600 tool calls during a single task, allowing it to really dig into a problem. On several challenging research benchmarks it performed very well, even rivaling some commercial AI systems.
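The interaction loop described above can be sketched in a few lines. This is a hypothetical illustration, not MiroThinker's actual implementation: the `run_agent`, `model`, and `tools` names are made up for the example, and only the 600-call budget comes from the paper. The key idea is that each tool observation is fed back into the context, so environment feedback can correct the agent's trajectory over many turns.

```python
# Minimal sketch of an interaction-scaled agent loop (hypothetical names;
# the paper does not describe its inference stack at this level of detail).
MAX_TOOL_CALLS = 600  # per-task tool-call budget reported in the paper


def run_agent(task, model, tools):
    """Alternate model steps and tool calls until the model answers
    or the tool-call budget is exhausted."""
    context = [{"role": "user", "content": task}]
    for _ in range(MAX_TOOL_CALLS):
        step = model(context)  # model proposes a tool call or a final answer
        if step["type"] == "answer":
            return step["content"]
        # Execute the requested tool (e.g. web search) and append the
        # observation, letting environment feedback refine the trajectory.
        observation = tools[step["tool"]](step["arguments"])
        context.append({"role": "tool", "content": observation})
    return None  # budget exhausted without a final answer
```

In a real agent, `model` would be an LLM call that emits structured tool requests, and `tools` would wrap search engines, browsers, or code interpreters; the sketch only shows the control flow that makes deeper interaction possible.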

Why it matters?

This research shows that improving how an AI interacts with its environment is just as important as making the AI itself bigger or giving it more memory. It establishes 'interaction scaling' as a key area for future development in AI research assistants, meaning that building AI that can effectively learn and adapt through interaction is crucial for creating truly powerful tools.

Abstract

We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or context length, MiroThinker explores interaction scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement. Unlike LLM test-time scaling, which operates in isolation and risks degradation with longer reasoning chains, interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories. Through reinforcement learning, the model achieves efficient interaction scaling: with a 256K context window, it can perform up to 600 tool calls per task, enabling sustained multi-turn reasoning and complex real-world research workflows. Across four representative benchmarks (GAIA, HLE, BrowseComp, and BrowseComp-ZH), the 72B variant achieves up to 81.9%, 37.7%, 47.1%, and 55.6% accuracy respectively, surpassing previous open-source agents and approaching commercial counterparts such as GPT-5-high. Our analysis reveals that MiroThinker benefits from interactive scaling consistently: research performance improves predictably as the model engages in deeper and more frequent agent-environment interactions, demonstrating that interaction depth exhibits scaling behaviors analogous to model size and context length. These findings establish interaction scaling as a third critical dimension for building next-generation open research agents, complementing model capacity and context windows.