DiffTester: Accelerating Unit Test Generation for Diffusion LLMs via Repetitive Pattern
Lekang Yang, Yuetong Liu, Yitong Zhang, Jia Li
2025-10-06
Summary
This paper introduces DiffTester, a framework that speeds up automatic unit test generation with diffusion language models, a newer class of AI model. It focuses on making the generated tests both fast to produce and high in quality.
What's the problem?
Automatically generating unit tests is important for good software development, but most current AI models write tests slowly, one token at a time. Newer 'diffusion' models can generate many tokens in parallel, which makes them faster, but raising the number of tokens produced per step usually degrades the quality and correctness of the tests. This creates a trade-off between speed and quality when using diffusion models for test generation.
What's the solution?
DiffTester tackles this problem by exploiting the fact that unit tests targeting the same piece of code often share repetitive structural patterns. During generation it analyzes the structure of the output using abstract syntax trees, and adaptively increases how many tokens are generated at once, but only when doing so is safe for test quality (see the sketch below). It also extends the TestEval benchmark, previously limited to Python, to additional programming languages including Java and C++.
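To make the idea concrete, here is a minimal Python sketch, not the authors' implementation: it uses the standard `ast` module to check whether already-generated tests share one structural "shape", and if so returns a larger per-step token budget. The helper names `skeleton` and `adaptive_block_size`, and the specific budget values, are illustrative assumptions.

```python
import ast

def skeleton(src: str) -> list:
    """Reduce a test to the sequence of AST node types that form its 'shape'."""
    return [type(node).__name__ for node in ast.walk(ast.parse(src))]

def adaptive_block_size(finished_tests: list, base: int = 4, boost: int = 16) -> int:
    """Use a larger per-step token budget when prior tests share one shape."""
    if len(finished_tests) < 2:
        return base  # not enough evidence of a repetitive pattern yet
    shapes = [skeleton(t) for t in finished_tests]
    if all(s == shapes[0] for s in shapes[1:]):
        return boost  # structurally identical tests: decode more tokens at once
    return base  # shapes diverge: fall back to the conservative budget

tests = [
    "def test_add():\n    assert add(1, 2) == 3\n",
    "def test_add_more():\n    assert add(2, 3) == 5\n",
]
print(adaptive_block_size(tests))  # -> 16: both tests share one skeleton
```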
Why it matters?
This work is important because it makes automated unit test generation much more efficient, allowing developers to find and fix bugs faster. DiffTester works with different AI models and programming languages, making it a practical solution for improving software quality and speeding up the development process.
Abstract
Software development relies heavily on extensive unit testing, which makes the efficiency of automated Unit Test Generation (UTG) particularly important. However, most existing LLMs generate test cases one token at a time in each forward pass, which leads to inefficient UTG. Recently, diffusion LLMs (dLLMs) have emerged, offering promising parallel generation capabilities and showing strong potential for efficient UTG. Despite this advantage, their application to UTG is still constrained by a clear trade-off between efficiency and test quality, since increasing the number of tokens generated in each step often causes a sharp decline in the quality of test cases. To overcome this limitation, we present DiffTester, an acceleration framework specifically tailored for dLLMs in UTG. The key idea of DiffTester is that unit tests targeting the same focal method often share repetitive structural patterns. By dynamically identifying these common patterns through abstract syntax tree analysis during generation, DiffTester adaptively increases the number of tokens produced at each step without compromising the quality of the output. To enable comprehensive evaluation, we extend the original TestEval benchmark, which was limited to Python, by introducing additional programming languages including Java and C++. Extensive experiments on three benchmarks with two representative models show that DiffTester delivers significant acceleration while preserving test coverage. Moreover, DiffTester generalizes well across different dLLMs and programming languages, providing a practical and scalable solution for efficient UTG in software development. Code and data are publicly available at https://github.com/wellbeingyang/DLM4UTG-open .
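As a toy illustration of the efficiency side of the trade-off (again an assumed interface, not the paper's code): a masked-diffusion decoder fills in masked positions over several steps, and the number of denoising steps falls directly as the per-step token budget rises. Here `toy_predict_step` stands in for the dLLM, and the step at which a 'repetitive pattern' is detected is hard-coded; both are hypothetical.

```python
MASK = "<mask>"

def decode(predict_step, length: int, budget_for):
    """Iteratively unmask a sequence, `budget_for(step)` tokens per step."""
    seq, step = [MASK] * length, 0
    while MASK in seq:
        seq = predict_step(seq, budget_for(step))
        step += 1
    return seq, step

def toy_predict_step(seq, k):
    """Stand-in for the dLLM: fill the first k still-masked positions."""
    out, filled = list(seq), 0
    for i, tok in enumerate(out):
        if tok == MASK and filled < k:
            out[i], filled = f"tok{i}", filled + 1
    return out

fixed = lambda step: 2                         # always 2 tokens per step
adaptive = lambda step: 8 if step >= 2 else 2  # pattern 'detected' at step 2

_, n_fixed = decode(toy_predict_step, 32, fixed)
_, n_adaptive = decode(toy_predict_step, 32, adaptive)
print(n_fixed, n_adaptive)  # -> 16 6: far fewer steps with an adaptive budget
```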