Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Qiushi Sun, Kanzhi Cheng, Junxian He, Jun Liu, Zhiyong Wu

2025-04-16


Summary

This paper introduces Genius, a way to train large language models (LLMs) so they get better at reasoning and solving problems on their own, without humans guiding or correcting them during training.

What's the problem?

The problem is that most advanced AI models need lots of labeled data and human feedback to learn how to reason well, which takes a lot of time and effort. This makes it hard to improve these models quickly or use them in areas where labeled data is hard to get.

What's the solution?

The researchers created a framework called Genius that lets the model teach itself through unsupervised self-training. At each step of solving a problem, the model tries out several candidate steps, looks ahead to see which ones are most promising, and re-samples among them (stepwise foresight re-sampling). It then fine-tunes itself on the steps that seem best, using an advantage-calibrated objective, all without outside help or labeled answers.
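The look-ahead idea described above can be illustrated with a small toy sketch. This is not the authors' implementation: the function names, the `rollout_logprob` callback, the softmax temperature, and the best/worst pairing rule are all assumptions made for illustration. The sketch scores each candidate step by averaging a look-ahead score over a few simulated continuations, samples the next step in proportion to those scores, and returns a (preferred, dispreferred) pair of the kind that a preference-style fine-tuning step could consume.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def foresight_scores(candidates, rollout_logprob, num_rollouts=4):
    """Score each candidate step by averaging several look-ahead rollouts.

    `rollout_logprob` is a hypothetical callback: given a candidate step, it
    simulates one continuation and returns a scalar score for it (for example
    an average token log-probability). It stands in for a real model call.
    """
    return [
        sum(rollout_logprob(c) for _ in range(num_rollouts)) / num_rollouts
        for c in candidates
    ]


def pick_step_and_pair(candidates, scores, rng):
    """Sample the next step and build a (preferred, dispreferred) pair.

    Assumption for illustration: the next step is sampled from a softmax over
    the look-ahead scores, and the training pair is simply the highest- and
    lowest-scoring candidates.
    """
    probs = softmax(scores)
    chosen = rng.choices(candidates, weights=probs, k=1)[0]
    order = sorted(range(len(candidates)), key=lambda i: scores[i])
    worst, best = candidates[order[0]], candidates[order[-1]]
    return chosen, (best, worst)


if __name__ == "__main__":
    # Toy scores standing in for model rollouts; higher means more promising.
    toy = {"step A": -2.0, "step B": -0.5, "step C": -1.0}
    candidates = list(toy)
    scores = foresight_scores(candidates, lambda c: toy[c])
    chosen, pair = pick_step_and_pair(candidates, scores, random.Random(0))
    print("next step:", chosen)
    print("training pair (preferred, dispreferred):", pair)
```

Sampling the next step instead of always taking the top-scoring candidate keeps some exploration in the search, which matters when the look-ahead scores are noisy estimates and no labeled answer exists to correct them.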

Why it matters?

This matters because it means AI models can get smarter and better at reasoning without needing tons of human input, which saves time and resources. It also makes it possible to improve AI in areas where humans can't easily provide the right answers, making these models more flexible and useful in the real world.

Abstract

Genius, an unsupervised self-training framework, enhances LLM reasoning by combining a stepwise foresight re-sampling strategy with advantage-calibrated optimization, improving responses without external supervision.
