HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale

Huy Nhat Phan, Phong X. Nguyen, Nghi D. Q. Bui

2024-09-26

Summary

This paper introduces HyperAgent, a new system designed to help solve various coding tasks in software engineering using multiple specialized agents. It mimics the way human developers work and can handle a wide range of programming challenges across different languages.

What's the problem?

While large language models (LLMs) have improved coding capabilities, most existing software agents are focused on specific tasks and cannot adapt to different types of software engineering challenges. This limits their usefulness in real-world scenarios where developers need to tackle a variety of tasks, such as fixing bugs or generating new code.

What's the solution?

To solve this problem, the researchers created HyperAgent, which consists of four specialized agents: a Planner for organizing tasks, a Navigator for understanding project structures, a Code Editor for writing code, and an Executor for running and verifying the code. Together, these agents manage the entire process of a software engineering task, from planning to execution, which makes the system more versatile than earlier task-specific agents. The researchers tested HyperAgent on several benchmarks and found that it outperformed existing methods in resolving GitHub issues, in repository-level code generation, and in fault localization and program repair.
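The division of labor among the four agents can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names, the fixed navigate-edit-execute order, and the stubbed agent replies are all assumptions made for illustration. In a real system each agent would wrap LLM calls and tools rather than return a canned string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    """A software engineering task plus a log of agent replies (hypothetical type)."""
    description: str
    log: List[str] = field(default_factory=list)


# Each specialized agent is modeled as a plain function from a request to a reply.
# These are stubs standing in for LLM-backed agents.
def navigator(request: str) -> str:
    return f"[navigator] located code relevant to: {request}"


def code_editor(request: str) -> str:
    return f"[code_editor] drafted a patch for: {request}"


def executor(request: str) -> str:
    return f"[executor] ran checks for: {request}"


class Planner:
    """Coordinates the other agents. The fixed ordering below is an assumption."""

    def __init__(self, agents: Dict[str, Callable[[str], str]]):
        self.agents = agents

    def solve(self, task: Task) -> Task:
        # Delegate to each agent in turn and record what it reports back.
        for role in ("navigator", "code_editor", "executor"):
            reply = self.agents[role](task.description)
            task.log.append(reply)
        return task


if __name__ == "__main__":
    planner = Planner(
        {"navigator": navigator, "code_editor": code_editor, "executor": executor}
    )
    result = planner.solve(Task("fix failing date parser test"))
    for line in result.log:
        print(line)
```

The sketch only shows the coordination pattern: one agent plans and delegates, while the others each handle one kind of work. How the actual agents exchange information, retry, or decide when a task is done is not described in this summary.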

Why it matters?

This research is important because it represents a significant step towards creating intelligent software agents that can assist developers in various coding tasks. By mimicking human workflows and being able to adapt to different programming languages and challenges, HyperAgent has the potential to improve productivity in software development, making it easier for developers to create and maintain high-quality software.

Abstract

Large Language Models (LLMs) have revolutionized software engineering (SE), demonstrating remarkable capabilities in various coding tasks. While recent efforts have produced autonomous software agents based on LLMs for end-to-end development tasks, these systems are typically designed for specific SE tasks. We introduce HyperAgent, a novel generalist multi-agent system designed to address a wide spectrum of SE tasks across different programming languages by mimicking human developers' workflows. Comprising four specialized agents (Planner, Navigator, Code Editor, and Executor), HyperAgent manages the full lifecycle of SE tasks, from initial conception to final verification. Through extensive evaluations, HyperAgent achieves state-of-the-art performance across diverse SE tasks: it attains a 25.01% success rate on SWE-Bench-Lite and 31.40% on SWE-Bench-Verified for GitHub issue resolution, surpassing existing methods. Furthermore, HyperAgent demonstrates state-of-the-art performance in repository-level code generation (RepoExec), and in fault localization and program repair (Defects4J), often outperforming specialized systems. This work represents a significant advancement towards versatile, autonomous agents capable of handling complex, multi-step SE tasks across various domains and languages, potentially transforming AI-assisted software development practices.