Towards Autonomous Mathematics Research
Tony Feng, Trieu H. Trinh, Garrett Bingham, Dawsen Hwang, Yuri Chervonyi, Junehyuk Jung, Joonkyung Lee, Carlo Pagano, Sang-hyun Kim, Federico Pasqualotto, Sergei Gukov, Jonathan N. Lee, Junsu Kim, Kaiying Hou, Golnaz Ghiasi, Yi Tay, YaGuang Li, Chenkai Kuang, Yuan Liu, Hanzhao Lin, Evan Zheran Liu
2026-02-12
Summary
This paper introduces Aletheia, an AI agent designed for mathematical research rather than just competition problem-solving. It can help *create* new mathematical knowledge and even write research papers, sometimes with minimal human help.
What's the problem?
While AI has gotten really good at solving challenging math problems like those in the International Mathematical Olympiad, real mathematical research is much harder. It requires sifting through tons of existing work, building complex arguments over many steps, and ultimately, creating something new. Existing AI systems weren't equipped to handle this level of complexity and long-term planning.
What's the solution?
The researchers built Aletheia, which is powered by an advanced version of an AI model called Gemini Deep Think. They also identified an inference-time scaling law that extends beyond Olympiad-level problems, and they gave the agent access to tools that let it search for information and verify its work. Aletheia doesn't just *get* answers; it *shows its work* in natural language, generating, checking, and revising its solutions step by step. They demonstrated this by having Aletheia contribute to actual research, including writing one paper on its own and collaborating with humans on another.
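The generate, check, and revise cycle described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Aletheia's actual implementation: the function `solve_iteratively`, the `Verdict` type, the `max_rounds` budget, and the toy arithmetic task are all hypothetical names chosen for this example, and the real system works on natural-language proofs rather than integers.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of one verification pass (hypothetical type for this sketch)."""
    ok: bool
    feedback: str = ""


def solve_iteratively(
    problem: int,
    generate: Callable[[int], int],
    verify: Callable[[int, int], Verdict],
    revise: Callable[[int, int, str], int],
    max_rounds: int = 5,
) -> Optional[int]:
    """Generate a candidate, then alternate verification and revision.

    Returns the first candidate that passes verification, or None if the
    budget of max_rounds verification passes is exhausted.
    """
    candidate = generate(problem)
    for round_index in range(max_rounds):
        verdict = verify(problem, candidate)
        if verdict.ok:
            return candidate
        # Only revise if another verification pass remains in the budget.
        if round_index < max_rounds - 1:
            candidate = revise(problem, candidate, verdict.feedback)
    return None


# Toy stand-ins (assumptions): find n with n * n == problem.
def toy_generate(problem: int) -> int:
    return 0  # deliberately poor first attempt


def toy_verify(problem: int, candidate: int) -> Verdict:
    if candidate * candidate == problem:
        return Verdict(ok=True)
    return Verdict(ok=False, feedback="too small")


def toy_revise(problem: int, candidate: int, feedback: str) -> int:
    return candidate + 1 if feedback == "too small" else candidate


result = solve_iteratively(49, toy_generate, toy_verify, toy_revise, max_rounds=10)
```

Returning `None` when the budget runs out, instead of the last unverified candidate, reflects the idea that an unchecked answer should not be reported as a solution; a different design could return the best attempt together with its verdict.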
Why does it matter?
This work is important because it shows AI is starting to become a genuine partner in mathematical discovery. It's not just a calculator anymore; it can help mathematicians explore new ideas and potentially solve long-standing problems. The paper also suggests we need new ways to measure how much AI contributes to research, distinguishing between AI-generated results and human-AI collaborations.
Abstract
Recent advances in foundational models have yielded reasoning systems capable of achieving a gold-medal standard at the International Mathematical Olympiad. The transition from competition-level problem-solving to professional research, however, requires navigating vast literature and constructing long-horizon proofs. In this work, we introduce Aletheia, a math research agent that iteratively generates, verifies, and revises solutions end-to-end in natural language. Specifically, Aletheia is powered by an advanced version of Gemini Deep Think for challenging reasoning problems, a novel inference-time scaling law that extends beyond Olympiad-level problems, and intensive tool use to navigate the complexities of mathematical research. We demonstrate the capability of Aletheia from Olympiad problems to PhD-level exercises and, most notably, through several distinct milestones in AI-assisted mathematics research: (a) a research paper (Feng26) generated by AI without any human intervention in calculating certain structure constants in arithmetic geometry called eigenweights; (b) a research paper (LeeSeo26) demonstrating human-AI collaboration in proving bounds on systems of interacting particles called independent sets; and (c) an extensive semi-autonomous evaluation (Feng et al., 2026a) of 700 open problems on Bloom's Erdős Conjectures database, including autonomous solutions to four open questions. To help the public better understand the developments pertaining to AI and mathematics, we suggest codifying standard levels quantifying autonomy and novelty of AI-assisted results. We conclude with reflections on human-AI collaboration in mathematics.