Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?
Chunqiu Steven Xia, Zhe Wang, Yan Yang, Yuxiang Wei, Lingming Zhang
2025-11-18
Summary
This paper introduces Live-SWE-agent, an AI agent that solves real-world software problems such as writing and fixing code. Its key advance is that it improves its own scaffold while *actively* working on a problem, whereas earlier self-improving agents required costly offline training beforehand.
What's the problem?
AI agents for software engineering are hard to design well. The space of possible agent designs is too large and too costly to explore fully, so even carefully hand-built agents may still be suboptimal. Existing 'self-improving' agents, such as the Darwin-Gödel Machine, still require costly offline training on specific benchmarks and may not generalize well to other benchmarks or to different underlying language models.
What's the solution?
The researchers created Live-SWE-agent, which starts from the most basic agent scaffold, with access only to bash tools, that is, the ability to run shell commands. As it works on a coding problem, it *automatically* adds new tools and modifies its own scaffold implementation. It evolves 'on the fly' while doing the work, rather than in a separate offline training phase. On the SWE-bench Verified benchmark it reached a solve rate of 75.4% without test-time scaling, outperforming all existing open-source software agents, and on SWE-Bench Pro it achieved the best-known solve rate of 45.8%.
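To make the idea concrete, here is a minimal sketch of an agent that begins with a single bash tool and registers new tools while it runs. It is an illustration under stated assumptions, not the authors' implementation: the names `LiveAgent`, `create_tool`, and `step` are hypothetical, a scripted list of actions stands in for the language model, and how the paper's agent actually represents or stores its tools is not described in this summary.

```python
import subprocess
from typing import Callable, Dict, Tuple

# Hypothetical sketch: every name below is an assumption made for illustration.
Tool = Callable[[str], str]


def run_bash(command: str) -> str:
    """The only tool available at the start: run a shell command."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=30
    )
    return result.stdout + result.stderr


class LiveAgent:
    """Toy agent whose tool set can grow during a single run."""

    def __init__(self) -> None:
        # Start from the most basic scaffold: bash only.
        self.tools: Dict[str, Tool] = {"bash": run_bash}

    def create_tool(self, name: str, source: str) -> str:
        """Register a new tool from Python source that defines a function `name`."""
        namespace: dict = {}
        # Running model-written code is unsafe outside a sandbox; this is a sketch.
        exec(source, namespace)
        self.tools[name] = namespace[name]
        return f"registered tool '{name}'"

    def step(self, action: Tuple[str, ...]) -> str:
        """Execute one action and return the observation fed back to the model."""
        kind = action[0]
        if kind == "create_tool":
            _, name, source = action
            return self.create_tool(name, source)
        if kind == "call":
            _, name, argument = action
            return self.tools[name](argument)
        raise ValueError(f"unknown action kind: {kind}")


# A scripted trajectory stands in for the decisions a language model would make.
COUNT_WORDS_SOURCE = '''
def count_words(text):
    return str(len(text.split()))
'''

agent = LiveAgent()
trajectory = [
    ("call", "bash", "echo hello"),
    ("create_tool", "count_words", COUNT_WORDS_SOURCE),
    ("call", "count_words", "fix the failing test"),
]
observations = [agent.step(action) for action in trajectory]
```

After this run, `agent.tools` contains both `bash` and `count_words`, and the last observation is `"4"`. The point of the design is that the tool set is ordinary mutable state inside the running agent, so extending it needs no separate training phase.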
Why does it matter?
This research shows a path toward AI agents that are more flexible and capable in software development. Because the agent keeps adapting at runtime, it avoids the cost of offline training on specific benchmarks and reduces the need for manual agent design. This could lead to AI tools that handle a wider range of coding tasks and assist programmers more effectively, potentially automating significant parts of the software creation process.
Abstract
Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are typically equipped with a suite of coding tools and can autonomously decide the next actions to form complete trajectories to solve end-to-end software tasks. While promising, they typically require dedicated design and may still be suboptimal, since it can be extremely challenging and costly to exhaust the entire agent scaffold design space. Recognizing that software agents are inherently software themselves that can be further refined/modified, researchers have proposed a number of self-improving software agents recently, including the Darwin-Gödel Machine (DGM). Meanwhile, such self-improving agents require costly offline training on specific benchmarks and may not generalize well across different LLMs or benchmarks. In this paper, we propose Live-SWE-agent, the first live software agent that can autonomously and continuously evolve itself on-the-fly during runtime when solving real-world software problems. More specifically, Live-SWE-agent starts with the most basic agent scaffold with only access to bash tools (e.g., mini-SWE-agent), and autonomously evolves its own scaffold implementation while solving real-world software problems. Our evaluation on the widely studied SWE-bench Verified benchmark shows that Live-SWE-agent can achieve an impressive solve rate of 75.4% without test-time scaling, outperforming all existing open-source software agents and approaching the performance of the best proprietary solution. Moreover, Live-SWE-agent outperforms state-of-the-art manually crafted software agents on the recent SWE-Bench Pro benchmark, achieving the best-known solve rate of 45.8%.