daVinci-Env: Open SWE Environment Synthesis at Scale

Dayuan Fu, Shenyu Wu, Yunze Wu, Zerui Peng, Yaxing Huang, Jie Sun, Ji Zeng, Mohan Jiang, Lin Zhang, Yukun Li, Jiarui Hu, Liming Liu, Jinlong Hou, Pengfei Liu

2026-03-16

Summary

This paper introduces OpenSWE, a large and openly available resource designed to help researchers develop and train artificial intelligence agents that can write and improve computer code.

What's the problem?

Currently, training these 'software engineering' (SWE) agents is difficult because there aren't enough large, realistic, and publicly accessible datasets. Existing datasets are either too small or don't represent the variety of real-world coding projects. Industrial datasets exist, but companies usually keep the details of how they were created secret, making it hard for academic researchers to learn from them or build upon their work.

What's the solution?

The researchers created OpenSWE, which includes over 45,000 executable coding environments built from more than 12,800 software projects, all packaged as Docker containers so they can be run and tested automatically. They used a system of automated agent tools running on a large computer cluster to build and filter these environments, discarding those that are either unsolvable or too easy so that only genuinely useful training instances remain. The whole project represents an investment of about $1.47 million, covering both environment construction and the creation of high-quality training data, and yielded about 13,000 examples of successful code solutions.
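The paper's filtering step keeps only environments that are neither unsolvable nor too easy. As a rough illustration of how such difficulty-aware filtering can work, the sketch below samples several agent attempts per environment and keeps environments whose empirical solve rate falls in a middle band. All function names and thresholds here are hypothetical, not the paper's released code.

```python
# Hypothetical sketch of difficulty-aware environment filtering:
# drop environments that are unsolvable (solve rate near 0) or
# trivial (solve rate near 1), based on sampled agent attempts.

def filter_environments(envs, attempts_per_env, run_agent,
                        min_rate=0.1, max_rate=0.9):
    """Keep environments whose empirical solve rate lies between
    min_rate and max_rate (inclusive); thresholds are illustrative."""
    kept = []
    for env in envs:
        # run_agent(env) returns True if the sampled attempt solved the task
        successes = sum(run_agent(env) for _ in range(attempts_per_env))
        rate = successes / attempts_per_env
        if min_rate <= rate <= max_rate:
            kept.append(env)
    return kept

# Toy usage: per-environment solve probabilities stand in for real rollouts.
if __name__ == "__main__":
    import random
    random.seed(0)
    envs = {"trivial": 1.0, "impossible": 0.0, "useful": 0.5}
    agent = lambda env: random.random() < envs[env]
    print(filter_environments(list(envs), 20, agent))  # → ['useful']
```

The middle-band criterion reflects the paper's stated goal of retaining instances that "maximize learning efficiency": environments an agent can never solve provide no reward signal, while ones it always solves teach nothing new.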

Why it matters?

OpenSWE is important because it provides a standardized and accessible platform for SWE agent research. By releasing all the tools and data, it lowers the barrier to entry for researchers and allows for more collaboration and faster progress in the field. The experiments show that AI models trained on OpenSWE perform very well on coding tasks and even improve performance on other types of reasoning problems, like math and science, demonstrating the broad benefits of this research.

Abstract

Training capable software engineering (SWE) agents demands large-scale, executable, and verifiable environments that provide dynamic feedback loops for iterative code editing, test execution, and solution refinement. However, existing open-source datasets remain limited in scale and repository diversity, while industrial solutions are opaque with unreleased infrastructure, creating a prohibitive barrier for most academic research groups. We present OpenSWE, the largest fully transparent framework for SWE agent training in Python, comprising 45,320 executable Docker environments spanning over 12.8k repositories, with all Dockerfiles, evaluation scripts, and infrastructure fully open-sourced for reproducibility. OpenSWE is built through a multi-agent synthesis pipeline deployed across a 64-node distributed cluster, automating repository exploration, Dockerfile construction, evaluation script generation, and iterative test analysis. Beyond scale, we propose a quality-centric filtering pipeline that characterizes the inherent difficulty of each environment, filtering out instances that are either unsolvable or insufficiently challenging and retaining only those that maximize learning efficiency. With $891K spent on environment construction and an additional $576K on trajectory sampling and difficulty-aware curation, the entire project represents a total investment of approximately $1.47 million, yielding about 13,000 curated trajectories from roughly 9,000 quality-guaranteed environments. Extensive experiments validate OpenSWE's effectiveness: OpenSWE-32B and OpenSWE-72B achieve 62.4% and 66.0% on SWE-bench Verified, establishing SOTA among the Qwen2.5 series. Moreover, SWE-focused training yields substantial out-of-domain improvements, including up to 12 points on mathematical reasoning and 5 points on science benchmarks, without degrading factual recall.
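The abstract describes a pipeline whose final stage, "iterative test analysis", repairs an environment until it builds and its tests pass. A minimal sketch of such a draft-validate-revise loop is shown below; `draft`, `validate`, and `revise` are placeholders for agent calls and do not come from the paper's released infrastructure.

```python
# Hypothetical sketch of an iterative build-and-repair loop, in the spirit
# of the abstract's "iterative test analysis" stage. The three callables
# stand in for agent steps; none of these names come from the paper.

def iterative_repair(draft, validate, revise, max_iters=4):
    """Draft a Dockerfile, then revise it from validation feedback until it
    builds and its tests pass, or give up after max_iters attempts."""
    candidate = draft()
    for _ in range(max_iters):
        ok, feedback = validate(candidate)    # e.g. docker build + run tests
        if ok:
            return candidate                   # environment is executable + verifiable
        candidate = revise(candidate, feedback)
    return None                                # unbuildable repo: filtered out

# Toy usage: a "Dockerfile" that validates once a missing dependency is added.
if __name__ == "__main__":
    draft = lambda: "FROM python:3.11\nRUN pip install -e ."
    validate = lambda d: ("pytest" in d, "ModuleNotFoundError: pytest")
    revise = lambda d, fb: d + "\nRUN pip install pytest"
    print(iterative_repair(draft, validate, revise))
```

Returning `None` for environments that never stabilize mirrors the filtering described in the abstract: only environments that end up executable and verifiable make it into the released dataset.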