Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny
Chuanhao Yan, Fengdi Che, Xuhan Huang, Xu Xu, Xin Li, Yizhi Li, Xingwei Qu, Jingzhe Shi, Zhuangzhuang He, Chenghua Lin, Yaodong Yang, Binhang Yuan, Hang Zhao, Yu Qiao, Bowen Zhou, Jie Fu
2025-07-24
Summary
This paper studies how formal language-based reasoning and automatic verification in Dafny, a verification-aware programming language, can help large language models generate code whose correctness can be mechanically checked.
What's the problem?
Writing reliable software is hard because programs often contain subtle bugs that are difficult to find and fix, especially as codebases grow larger and more complex; testing alone cannot guarantee correctness.
What's the solution?
The researchers used Dafny, a programming language whose programs carry formal specifications that a verifier checks automatically against the code. They combined this with reinforcement learning in large language models, using the verifier's verdict as a training signal, which reduces the need for human-designed priors and lets the models learn to generate code that is formally verified for correctness.
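As a minimal illustration (not an example from the paper), a Dafny method states its contract as machine-checked pre- and postconditions; the verifier accepts a body, whether human- or LLM-written, only if it provably satisfies that contract:

```dafny
// Hypothetical example: the `ensures` clauses are the formal
// specification. Dafny's verifier proves the body meets them,
// or rejects the program, with no tests required.
method Abs(x: int) returns (y: int)
  ensures y >= 0          // result is non-negative
  ensures y == x || y == -x  // result is x or its negation
{
  if x < 0 {
    y := -x;
  } else {
    y := x;
  }
}
```

In an RL setup like the one described, such verifier verdicts (proof succeeds or fails) can serve as an automatic reward, replacing hand-crafted human feedback.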
Why it matters?
This matters because it makes AI-assisted software development more trustworthy and scalable: the correctness of model-generated code is machine-checked rather than assumed, reducing bugs and errors in real-world applications.
Abstract
Formal language-based reasoning and automatic verification improve the reliability and scalability of Large Language Models in generating verifiable code.