OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng
2024-07-25

Summary
This paper introduces OpenDevin, a new platform designed for developing AI agents that can perform software development tasks similar to human programmers. It allows these agents to write code, interact with command lines, and browse the web effectively.
What's the problem?
While AI has made significant advances in many areas, there are still challenges in creating AI agents that can handle complex software development tasks. Existing systems often lack the ability to work flexibly and safely in real-world environments, making it difficult for them to assist developers effectively. Additionally, evaluating the performance of these AI agents can be complicated.
What's the solution?
OpenDevin addresses these challenges by providing a structured platform where developers can create and test AI agents. It includes features for safe code execution in controlled environments (sandboxing), coordination between multiple agents, and a set of evaluation benchmarks to assess their performance. The authors evaluate agents on 15 benchmark tasks spanning software engineering (e.g., SWE-Bench) and web browsing (e.g., WebArena), among others.
Why does it matter?
This research matters because it gives developers an open platform for building, running, and evaluating AI agents that write code, use a command line, and browse the web. By combining sandboxed execution with a shared set of benchmarks, OpenDevin makes agents safer to test and easier to compare, which can help streamline the software development process.
Abstract
Software is one of the most powerful tools that we humans have at our disposal; it allows a skilled programmer to interact with the world in complex and profound ways. At the same time, thanks to improvements in large language models (LLMs), there has also been rapid development of AI agents that interact with and effect change in their surrounding environments. In this paper, we introduce OpenDevin, a platform for the development of powerful and flexible AI agents that interact with the world in similar ways to those of a human developer: by writing code, interacting with a command line, and browsing the web. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, coordination between multiple agents, and incorporation of evaluation benchmarks. Based on our currently incorporated benchmarks, we perform an evaluation of agents over 15 challenging tasks, including software engineering (e.g., SWE-Bench) and web browsing (e.g., WebArena), among others. Released under the permissive MIT license, OpenDevin is a community project spanning academia and industry with more than 1.3K contributions from over 160 contributors and will improve going forward.
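
To make the idea of an agent that acts by running code more concrete, here is a minimal sketch in Python of an action/observation loop. It is not OpenDevin's API: the names `Observation`, `ScriptedAgent`, `run_in_subprocess`, and `run_episode` are hypothetical and chosen only for illustration. The agent here replays a fixed list of actions where a real agent would query an LLM, and a child process with a timeout stands in for the sandbox, which is an assumption made for brevity and provides no real isolation.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Observation:
    """Result of executing one action (hypothetical type, not OpenDevin's)."""
    stdout: str
    exit_code: int


def run_in_subprocess(code: str, timeout: float = 5.0) -> Observation:
    """Run Python code in a separate interpreter process with a timeout.

    A child process is only a stand-in for a sandbox here; it does not
    isolate the filesystem or network the way a container would.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Report a timeout as a failed action instead of crashing the loop.
        return Observation(stdout="", exit_code=-1)
    return Observation(stdout=proc.stdout.strip(), exit_code=proc.returncode)


class ScriptedAgent:
    """Toy agent that replays fixed actions; a real agent would call an LLM."""

    def __init__(self, actions):
        self._actions = list(actions)

    def step(self, history):
        # One action per turn; return None once the script is exhausted.
        turn = len(history)
        return self._actions[turn] if turn < len(self._actions) else None


def run_episode(agent, max_steps: int = 10):
    """Alternate agent actions and execution results until the agent stops."""
    history = []
    for _ in range(max_steps):
        action = agent.step(history)
        if action is None:
            break
        history.append((action, run_in_subprocess(action)))
    return history


if __name__ == "__main__":
    agent = ScriptedAgent(["print(1 + 1)", "import sys; sys.exit(3)"])
    for action, obs in run_episode(agent):
        print(repr(action), "->", obs)
```

The loop passes the full history back to the agent on every step, so an LLM-backed agent could condition its next action on earlier outputs and exit codes. Swapping `run_in_subprocess` for a container-based executor would change the isolation guarantees without changing the loop.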