
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience

Jiangjie Chen, Wenxiang Chen, Jiacheng Du, Jinyi Hu, Zhicheng Jiang, Allan Jie, Xiaoran Jin, Xing Jin, Chenggang Li, Wenlei Shi, Zhihong Wang, Mingxuan Wang, Chenrui Wei, Shufa Wei, Huajian Xin, Fan Yang, Weihao Gao, Zheng Yuan, Tianyang Zhan, Zeyu Zheng, Tianxi Zhou, Thomas Hanwen Zhu

2025-12-22


Summary

This paper introduces Seed-Prover 1.5, a new artificial intelligence system designed to automatically prove mathematical theorems, particularly those found in undergraduate courses and beyond.

What's the problem?

While large language models are getting good at *writing out* mathematical proofs in a way humans can understand, getting them to work with formal proof languages like Lean is really hard and takes a lot of computing power. Formal languages require absolute precision: every step must be machine-checked, and even small errors cause the whole proof to be rejected. Existing systems struggle with the complexity of undergraduate and more advanced math problems.
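To make "absolute precision" concrete, here is a toy Lean 4 theorem (not from the paper): unlike an informal argument, every claim must type-check against a known lemma before Lean accepts the proof.

```lean
-- Illustrative example only. In Lean, "obviously true" is not enough:
-- the proof must cite a justification the checker can verify, here the
-- core library lemma Nat.add_comm.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

If the final line were replaced with an unjustified claim, Lean would reject the entire proof; this strict pass/fail verdict is exactly the feedback signal the paper's training process relies on.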

What's the solution?

The researchers created Seed-Prover 1.5, which learns by repeatedly attempting problems and getting feedback from the Lean system. This is done through 'reinforcement learning': the AI is rewarded when Lean accepts its proof steps and penalized when they fail. They also built a test-time workflow that bridges natural language (like English) and the formal language Lean, making it easier for the AI to carry insights from informal reasoning into formal proofs. The system improves over time by accumulating and building on its past experience.
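The core loop described above can be sketched in miniature. This is not the paper's actual algorithm; it is a toy illustration, with a stubbed-in `lean_check` standing in for the real Lean verifier and hypothetical tactic names, showing how verifier feedback can reinforce successful proof attempts.

```python
import random

def lean_check(proof: str) -> bool:
    """Stand-in for the Lean verifier (assumption, not the real API):
    accepts only the one 'correct' tactic. A real system would compile
    the candidate proof with Lean and return its pass/fail verdict."""
    return proof == "exact Nat.add_comm a b"

def propose(policy: dict[str, float]) -> str:
    """Sample a candidate tactic, weighted by learned scores."""
    tactics = list(policy)
    weights = [policy[t] for t in tactics]
    return random.choices(tactics, weights=weights, k=1)[0]

def rl_loop(rounds: int = 200, lr: float = 0.5) -> dict[str, float]:
    # Toy policy over three candidate tactics (hypothetical names),
    # all equally likely at the start.
    policy = {"rfl": 1.0, "simp": 1.0, "exact Nat.add_comm a b": 1.0}
    for _ in range(rounds):
        tactic = propose(policy)
        reward = 1.0 if lean_check(tactic) else 0.0
        # Reinforce tactics the verifier accepts; gently decay the
        # rest, keeping weights positive so sampling stays valid.
        policy[tactic] = max(0.05, policy[tactic] + lr * (reward - 0.1))
    return policy

random.seed(0)
learned = rl_loop()
best_tactic = max(learned, key=learned.get)
```

After enough rounds, the verified tactic accumulates nearly all the probability mass: high-quality formal feedback steers the policy without any human labeling, which is the intuition behind "learning from experience" here.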

Why it matters?

Seed-Prover 1.5 is a significant step forward because it solves a much higher percentage of challenging math problems than previous systems, while using a smaller compute budget. It solved 88% of PutnamBench (undergraduate-level), 80% of Fate-H (graduate-level), and 33% of Fate-X (PhD-level) problems, and answered 11 of the 12 problems from the Putnam 2025 competition within 9 hours. This suggests that AI has the potential to become a powerful tool for mathematicians and could eventually automate a significant part of the theorem-proving process.

Abstract

Large language models have recently made significant progress in generating rigorous mathematical proofs. In contrast, utilizing LLMs for theorem proving in formal languages (such as Lean) remains challenging and computationally expensive, particularly when addressing problems at the undergraduate level and beyond. In this work, we present Seed-Prover 1.5, a formal theorem-proving model trained via large-scale agentic reinforcement learning, alongside an efficient test-time scaling (TTS) workflow. Through extensive interactions with Lean and other tools, the model continuously accumulates experience during the RL process, substantially enhancing the capability and efficiency of formal theorem proving. Furthermore, leveraging recent advancements in natural language proving, our TTS workflow efficiently bridges the gap between natural and formal languages. Compared to state-of-the-art methods, Seed-Prover 1.5 achieves superior performance with a smaller compute budget. It solves 88% of PutnamBench (undergraduate-level), 80% of Fate-H (graduate-level), and 33% of Fate-X (PhD-level) problems. Notably, using our system, we solved 11 out of 12 problems from Putnam 2025 within 9 hours. Our findings suggest that scaling learning from experience, driven by high-quality formal feedback, holds immense potential for the future of formal mathematical reasoning.