SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers

Kechen Li, Wenqi Zhu, Coralia Cartis, Tianbo Ji, Shiwei Liu

2025-03-03

Summary

This paper is about teaching AI language models to solve a tough math problem: determining whether a polynomial is nonnegative, a question closely related to Hilbert's Seventeenth Problem. The researchers created a special dataset and reasoning instructions to help the AI models get better at this task.

What's the problem?

Large Language Models (LLMs) are really good at many tasks, but they struggle with complex math problems. Specifically, they have a hard time figuring out if a polynomial (a type of math expression) is always greater than or equal to zero, which is an important but very difficult problem in mathematics.
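The paper treats this as a reasoning task for LLMs, but the underlying math is worth a concrete look. One classic way to certify nonnegativity is to write the polynomial as a sum of squares (SoS): for a quadratic form, this reduces to checking that its Gram matrix is positive semidefinite. The sketch below is purely illustrative and not from the paper; the function name and example polynomials are my own.

```python
import numpy as np

def is_psd(Q, tol=1e-9):
    """A symmetric matrix Q is positive semidefinite iff all its
    eigenvalues are >= 0 (up to numerical tolerance)."""
    return bool(np.all(np.linalg.eigvalsh(Q) >= -tol))

# p(x, y) = x^2 - 2xy + y^2 corresponds to the Gram matrix
# Q = [[1, -1], [-1, 1]] via p(v) = v^T Q v with v = (x, y).
Q = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(is_psd(Q))  # True: p(x, y) = (x - y)^2 >= 0 everywhere

# q(x, y) = x^2 - 3xy + y^2 is negative at x = y = 1, so it is
# not nonnegative and its Gram matrix is not PSD.
Q_bad = np.array([[1.0, -1.5], [-1.5, 1.0]])
print(is_psd(Q_bad))  # False
```

For higher-degree polynomials the same idea leads to a semidefinite program over a vector of monomials, which is far more expensive; and by Hilbert's work, SoS is sufficient but not necessary for nonnegativity in general (the Motzkin polynomial is a famous counterexample). That gap is part of what makes the decision problem so hard.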

What's the solution?

The researchers did three main things. First, they made a dataset called SoS-1K with about 1,000 carefully chosen polynomials. Second, they created special instructions to guide the AI in solving these problems. Finally, they trained their own AI model, called SoS-7B, using this dataset. They found that with good instructions, AI models got much better at solving these problems, and their SoS-7B model did even better than some much larger and more powerful models.

Why it matters?

This matters because it shows that AI can potentially tackle really hard math problems that even specialized computer methods struggle with. It's not just about math, though: being able to solve these kinds of problems could help in many areas like engineering, science, and technology. Plus, the fact that a smaller, more efficient AI model could outperform bigger ones suggests we might be able to make powerful AI tools that don't need as much computing power, making them more accessible and practical to use.

Abstract

Large Language Models (LLMs) have achieved human-level proficiency across diverse tasks, but their ability to perform rigorous mathematical problem solving remains an open challenge. In this work, we investigate a fundamental yet computationally intractable problem: determining whether a given multivariate polynomial is nonnegative. This problem, closely related to Hilbert's Seventeenth Problem, plays a crucial role in global polynomial optimization and has applications in various fields. First, we introduce SoS-1K, a meticulously curated dataset of approximately 1,000 polynomials, along with expert-designed reasoning instructions based on five progressively challenging criteria. Evaluating multiple state-of-the-art LLMs, we find that without structured guidance, all models perform only slightly above the random-guess baseline of 50%. However, high-quality reasoning instructions significantly improve accuracy, boosting performance up to 81%. Furthermore, our 7B model, SoS-7B, fine-tuned on SoS-1K for just 4 hours, outperforms the 671B DeepSeek-V3 and GPT-4o-mini in accuracy while requiring only 1.8% and 5% of the computation time needed by the latter two, respectively. Our findings highlight the potential of LLMs to push the boundaries of mathematical reasoning and tackle NP-hard problems.