Learning to Predict Program Execution by Modeling Dynamic Dependency on Code Graphs

Cuong Chi Le, Hoang Nhat Phan, Huy Nhat Phan, Tien N. Nguyen, Nghi D. Q. Bui

2024-08-09

Summary

This paper introduces a new method for predicting how programs will run without actually executing them, using a technique called Dynamic Dependencies Learning to model the relationships between different parts of code.

What's the problem?

Understanding how a program will behave during execution is crucial for software development. However, traditional methods often fail to capture the complex interactions between different parts of the code, making it hard to predict issues like runtime errors or how much of the code will be executed (code coverage). This can lead to bugs and inefficient coding practices.

What's the solution?

The authors present a framework called CodeFlow that uses control flow graphs (CFGs) to represent all possible paths a program can take during execution. By analyzing these graphs, the model learns both static dependencies (how code is structured) and dynamic dependencies (how different parts of the code affect each other during execution). This allows for more accurate predictions about code coverage and helps identify potential runtime errors before running the program.
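To make the CFG idea concrete, here is a minimal sketch of a control flow graph for a toy five-statement program, together with an enumeration of its execution paths. The toy program, the `CFG` dictionary layout, and the `all_paths` helper are illustrative assumptions, not the representation CodeFlow itself uses.

```python
from typing import Dict, List, Tuple

# Hypothetical toy program, one statement per CFG node:
#   1: x = int(s)
#   2: if x > 0:
#   3:     y = 10 // x
#   4: else: y = 0
#   5: print(y)

# Illustrative CFG: node id -> successor node ids (assumed layout).
CFG: Dict[int, List[int]] = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}


def all_paths(cfg: Dict[int, List[int]], node: int, exit_node: int,
              prefix: Tuple[int, ...] = ()) -> List[List[int]]:
    """Enumerate every acyclic path from `node` to `exit_node`."""
    prefix = prefix + (node,)
    if node == exit_node:
        return [list(prefix)]
    paths: List[List[int]] = []
    for succ in cfg[node]:
        if succ not in prefix:  # skip back edges so loops terminate
            paths.extend(all_paths(cfg, succ, exit_node, prefix))
    return paths


PATHS = all_paths(CFG, 1, 5)
print(PATHS)
```

Each path corresponds to one branch outcome at node 2. A learned model would have to decide, for a given input, which of these statically possible paths is actually taken.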

Why does it matter?

This research is important because it improves how developers can anticipate problems in their code, leading to better software quality and efficiency. By enabling more accurate predictions about program behavior, this method can help reduce debugging time and improve overall programming practices.

Abstract

Predicting program behavior without execution is an essential and challenging task in software engineering. Traditional models often struggle to capture dynamic dependencies and interactions within code. This paper introduces a novel machine learning-based framework called CodeFlow, which predicts code coverage and detects runtime errors through Dynamic Dependencies Learning. Utilizing control flow graphs (CFGs), CodeFlow represents all possible execution paths and the relationships between different statements, offering a comprehensive understanding of program behavior. It constructs CFGs to depict execution paths and learns vector representations for CFG nodes, capturing static control-flow dependencies. Additionally, it learns dynamic dependencies through execution traces, which reflect the impacts among statements during execution. This approach enables accurate prediction of code coverage and identification of runtime errors. Empirical evaluations show significant improvements in code coverage prediction accuracy and effective localization of runtime errors, surpassing current models.
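As a rough illustration of how an execution trace relates to the prediction targets named in the abstract, the sketch below turns one trace into per-node coverage labels, consecutive-statement pairs, and a candidate error location. The node ids, the trace, and all three helper functions are hypothetical. Treating a trace that stops before the exit node as a runtime error is an assumption made for this example, not a description of the paper's exact procedure.

```python
from typing import Dict, List, Optional, Tuple

NODES: List[int] = [1, 2, 3, 4]  # hypothetical CFG node ids; 4 is the exit node
EXIT: int = 4
# Hypothetical trace: execution raised at node 3 and never reached the exit.
TRACE: List[int] = [1, 2, 3]


def coverage_labels(nodes: List[int], trace: List[int]) -> Dict[int, int]:
    """Label each node 1 if it appears in the trace, else 0."""
    executed = set(trace)
    return {n: int(n in executed) for n in nodes}


def dynamic_edges(trace: List[int]) -> List[Tuple[int, int]]:
    """Pairs of statements executed back to back in this run."""
    return list(zip(trace, trace[1:]))


def crash_site(trace: List[int], exit_node: int) -> Optional[int]:
    """Assumed heuristic: if the run never reached the exit, blame the last node."""
    if trace and trace[-1] != exit_node:
        return trace[-1]
    return None


LABELS = coverage_labels(NODES, TRACE)
EDGES = dynamic_edges(TRACE)
CRASH = crash_site(TRACE, EXIT)
print(LABELS, EDGES, CRASH)
```

In a learning setup along these lines, labels like `LABELS` would be what a model is trained to predict from the source code and CFG alone, without running the program.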
