SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
Huatong Song, Lisheng Huang, Shuang Sun, Jinhao Jiang, Ran Le, Daixuan Cheng, Guoxin Chen, Yiwen Hu, Zongchao Chen, Wayne Xin Zhao, Yang Song, Tao Zhang, Ji-Rong Wen
2026-02-04
Summary
This paper introduces SWE-Master, a new, openly available system designed to build and train AI agents that can perform software engineering tasks, like writing and debugging code.
What's the problem?
Currently, creating AI agents capable of complex software engineering is difficult. Existing agents often struggle with tasks that require planning and executing multiple steps over a long period, and it's hard to reproduce results because the process isn't clearly defined or shared. There's a need for a standardized, transparent way to develop and improve these agents.
What's the solution?
The researchers created SWE-Master, which systematically addresses every step of building a software engineering agent. This includes creating training data, initially teaching the agent with examples, refining it using a reward system based on actual code execution, and designing how the agent makes decisions. They started with an open-source base model that had limited software engineering ability and improved its skills through this structured process, then tested it on a standard set of software tasks called SWE-bench Verified.
Why it matters?
SWE-Master significantly improves the performance of open-source software engineering agents, resolving 61.4% of the tasks in SWE-bench Verified and 70.8% when it is allowed eight attempts per task with test-time scaling. Importantly, it provides a complete and reproducible framework, meaning other researchers can easily build upon this work and accelerate progress in the field of AI-assisted software development. This makes it easier to create more reliable and helpful AI tools for programmers.
Abstract
In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, long-horizon SFT, RL with real execution feedback, and inference framework design. Starting from an open-source base model with limited initial SWE capability, SWE-Master demonstrates how systematic optimization can elicit strong long-horizon SWE task-solving abilities. We evaluate SWE-Master on SWE-bench Verified, a standard benchmark for realistic software engineering tasks. Under identical experimental settings, our approach achieves a resolve rate of 61.4% with Qwen2.5-Coder-32B, substantially outperforming existing open-source baselines. By further incorporating test-time scaling (TTS) with LLM-based environment feedback, SWE-Master reaches 70.8% at TTS@8, demonstrating strong performance potential. SWE-Master provides a practical and transparent foundation for advancing reproducible research on software engineering agents. The code is available at https://github.com/RUCAIBox/SWE-Master.
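To make the two reported numbers concrete, the sketch below shows one plausible reading of "resolve rate" and "TTS@8". It is an illustration only: it assumes that resolve rate is the percentage of task instances whose final patch passes the tests, and that TTS@8 means sampling eight candidate patches per task and keeping the one a scoring function ranks highest. The abstract does not spell out the selection procedure, so the names `select_best`, `resolve_rate`, and the `score` callback are hypothetical and are not taken from the SWE-Master code.

```python
from typing import Callable, Sequence


def select_best(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate patch with the highest score (first one wins on ties).

    `score` is a hypothetical stand-in for whatever feedback signal ranks the
    candidates, for example an LLM-based judgment of each patch.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


def resolve_rate(resolved: Sequence[bool]) -> float:
    """Percentage of task instances whose selected patch resolved the issue."""
    if not resolved:
        raise ValueError("need at least one task instance")
    return 100.0 * sum(resolved) / len(resolved)


if __name__ == "__main__":
    # Toy data: eight candidate patches for one task, scored here by length
    # purely so the example runs without a model.
    patches = [f"patch-{'x' * i}" for i in range(8)]
    print(select_best(patches, score=len))

    # A made-up outcome list: 307 resolved tasks out of 500.
    outcomes = [True] * 307 + [False] * 193
    print(f"{resolve_rate(outcomes):.1f}%")
```

Under this reading, the gain from 61.4% to 70.8% would come entirely from choosing among multiple attempts at inference time, not from any further training.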