FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling

Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, Yang Liu, Long Chen, Dong Wang, Yicheng Chen, Cunyin Peng, Chenyi Zhuang, Jinjie Gu, Leilei Gan, Xiangyu Zhao, Shi Gu

2025-10-29

Summary

This paper introduces a new method, called FunReason-MT, for creating training data that helps large language models (LLMs) learn how to use tools and interact with the real world. It's about making AI systems better at solving complex tasks by letting them use things like calculators, search engines, or other software.

What's the problem?

Currently, it's hard to create good training data that teaches LLMs function calling, that is, using external tools. Existing methods for generating this data aren't good enough for realistic situations. Specifically, it's difficult to train models to use tools effectively over multiple turns, to keep the specifics of the tool architecture separate from the LLM's reasoning, and to ensure that each step in a multi-turn tool-using process logically follows from the results of earlier steps.

What's the solution?

FunReason-MT tackles these problems in three main ways. First, it models the environment as a graph of APIs and their connections, and explores that graph to gather many varied, high-quality examples of tool use. Second, it works backward from those tool interactions to construct the user queries, simplifying the creation of hard, realistic requests. Finally, it uses a guided, iterative process to generate detailed step-by-step reasoning (a chain of thought) for each action the LLM takes, making sure each step follows logically from the previous results and the overall task.

Why it matters?

This research is important because it significantly improves the performance of LLMs when it comes to using tools. A model trained with data generated by FunReason-MT performed better than many other similar-sized models, and even outperformed some closed-source (proprietary) models on standard benchmarks. This means we're getting closer to AI systems that can reliably and effectively solve real-world problems by using the tools available to them.

Abstract

Function calling (FC) empowers large language models (LLMs) and autonomous agents to interface with external tools, a critical capability for solving complex, real-world problems. As this ability becomes increasingly central to advanced AI systems, the need for high-quality, multi-turn training data to develop and refine it cannot be overstated. Existing data synthesis methods, such as random environment sampling or multi-agent role-playing, are not powerful enough to generate high-quality data in real-world environments. The practical challenges are threefold: targeted model training, isolation of tool architecture, and multi-turn logical dependency. To address these structural deficiencies, we present FunReason-MT, a novel data synthesis framework for real-world multi-turn tool use. FunReason-MT resolves the complexity barrier in multi-turn FC data by employing 1) Environment-API Graph Interactions to gather varied high-quality trajectories, 2) Advanced Tool-Query Synthesis to simplify hard query construction, and 3) Guided Iterative Chain for sophisticated CoT generation. Evaluations on the Berkeley Function-Calling Leaderboard (BFCLv3) demonstrate the power of our framework: a 4B model trained on FunReason-MT-generated data achieves state-of-the-art performance among comparable-sized models, outperforming most closed-source models. Further performance improvements on BFCLv4 confirm that FunReason-MT provides a reliable and robust source for agentic learning.