SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks

Lianghong Guo, Yanlin Wang, Caihua Li, Pengyu Yang, Jiachi Chen, Wei Tao, Yingtian Zou, Duyu Tang, Zibin Zheng

2025-06-15

Summary

SWE-Factory is an automated pipeline that builds and validates large datasets from GitHub issues, for training and evaluating large language models on software issue resolution. It combines three components: SWE-Builder, a multi-agent system that sets up test environments; a grading method that judges test results by their exit codes; and automated fail-to-pass validation, which confirms that each task's tests fail before the fix and pass after it.

What's the problem?

Building large datasets for training AI to fix software problems is labor-intensive. Three steps are especially costly: setting up an environment in which code fixes can be tested, grading the test results, and validating that each task works as intended. All three usually require substantial manual effort.

What's the solution?

The researchers built SWE-Factory to automate these steps. SWE-Builder, a group of AI agents that work together, sets up testing environments, gathers the necessary data, and prepares tests. Grading relies on exit codes, the simple status signals a test run returns when it finishes, so outcomes can be judged automatically without writing complex grading code. The pipeline also automates a fail-to-pass check that validates each task. Together, these steps make it faster to build large, accurate datasets.
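
The exit-code grading and fail-to-pass check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `grade_by_exit_code` and `is_fail2pass` are hypothetical, and the two stand-in commands simply exit with a fixed code in place of a real test suite run before and after a fix.

```python
import subprocess
import sys
from typing import Sequence


def grade_by_exit_code(test_cmd: Sequence[str]) -> bool:
    """Hypothetical helper: treat exit code 0 as 'pass', anything else as 'fail'."""
    result = subprocess.run(test_cmd, capture_output=True)
    return result.returncode == 0


def is_fail2pass(before_cmd: Sequence[str], after_cmd: Sequence[str]) -> bool:
    """Hypothetical helper: a task is valid when the tests fail before
    the fix is applied and pass after it."""
    failed_before = not grade_by_exit_code(before_cmd)
    passed_after = grade_by_exit_code(after_cmd)
    return failed_before and passed_after


# Stand-ins for running the test suite in the unpatched and patched
# environments (assumption: a real pipeline would run the tests in a container).
FAILING = [sys.executable, "-c", "import sys; sys.exit(1)"]
PASSING = [sys.executable, "-c", "import sys; sys.exit(0)"]

if __name__ == "__main__":
    print(is_fail2pass(FAILING, PASSING))  # fails before, passes after
    print(is_fail2pass(PASSING, PASSING))  # already passing: not a valid task
```

The point of the design is that the grader never reads the test output: it only needs the process's exit status, so the same check works regardless of which test framework a repository uses.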

Why does it matter?

SWE-Factory matters because it makes creating and verifying big, high-quality datasets for training AI to fix software issues much faster and easier. This helps improve the ability of AI models to understand and solve real-world programming problems more reliably.

Abstract

A pipeline named SWE-Factory automates the creation and validation of GitHub issue resolution datasets for training and evaluating Large Language Models, using SWE-Builder for environment setup, exit-code-based grading, and automated fail2pass validation.