Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving

Qi Liu, Xinhao Zheng, Renqiu Xia, Xingzhi Qi, Qinxiang Cao, Junchi Yan

2025-05-08

Summary

This paper presents a new way for AI to solve formal problems, meaning problems stated with strict rules and logic. It goes beyond proving theorems to cover a wider range of tasks. The researchers introduce a framework called FPS that helps AI approach these problems in a more organized and reliable way.

What's the problem?

The problem is that most AI systems that deal with formal logic are mainly focused on proving mathematical theorems, which is only one type of formal problem. There hasn't been a good general framework for AI to handle other kinds of formal problem-solving, or a way to fairly test and compare different approaches.

What's the solution?

The researchers created the FPS (Formal Problem-Solving) framework, which builds on existing formal theorem proving (FTP) environments so that each step of a solution can be checked. They pair it with RPE (Restricted Propositional Equivalence), a method for verifying that a final answer is correct. They also designed benchmarks to test how well different AI systems perform in this setup.
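To make the difference between proving and solving concrete, here is a minimal Lean 4 sketch. It is an illustration only, not the paper's formulation: the names `proving_form` and `solving_form` and the toy equation are hypothetical. In the first statement the answer is already written into the theorem. In the second, the answer is unknown in the type, so a solver must produce it together with a proof that it is correct.

```lean
-- Theorem proving: the answer (3) is part of the statement; only a proof is needed.
theorem proving_form : ∀ x : Nat, x + 2 = 5 → x = 3 := by
  intro x h
  omega

-- Problem solving (toy version): the answer `a` is not given in the type.
-- The solver must supply a value and a proof that it is the right one.
def solving_form : { a : Nat // ∀ x : Nat, x + 2 = 5 → x = a } :=
  ⟨3, by intro x h; omega⟩
```

Because the proof checker accepts the second definition only when the supplied value satisfies the stated property, a wrong answer such as 4 would be rejected rather than silently accepted.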

Why it matters?

This matters because it opens up new possibilities for AI to help in areas that require strict logical thinking, like computer science, engineering, and mathematics. By providing a better way to train and test AI on these problems, researchers can develop smarter systems that are more useful for solving real-world challenges that depend on formal logic.

Abstract

The paper introduces FPS, a formal framework for AI-based problem-solving that uses FTP environments to verify the solving process and RPE to verify answers, together with benchmarks and an evaluation.