Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate
Yubo Wang, Xiang Yue, Wenhu Chen
2025-01-30
Summary
This paper introduces a new way to train AI language models called Critique Fine-Tuning (CFT). Instead of just teaching AI to copy correct answers, CFT teaches it to analyze and critique responses, which helps the AI think more critically and understand things better.
What's the problem?
The standard way of training AI language models, called Supervised Fine-Tuning (SFT), focuses on making the AI imitate correct answers. This method doesn't teach the AI to think deeply or understand the nuances of problems, especially in areas like math where critical thinking is important.
What's the solution?
The researchers created CFT, which trains AI to critique answers instead of just copying them. They made a dataset of 50,000 examples where a smart AI (GPT-4o) critiqued answers to math problems. They then used this to train different AI models and compared the results to the old SFT method. CFT consistently performed 4-10% better on math tests, even when using much less training data than other methods.
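To make the contrast with SFT concrete, here is a minimal sketch of how a CFT training example might be assembled from the paper's described format, (input=[query; noisy response], output=critique). The function name, field names, and prompt wording are illustrative assumptions, not the paper's exact template.

```python
def build_cft_example(query, noisy_response, critique):
    """Pack a question and a candidate (possibly wrong) response into the
    model input; the teacher-written critique is the training target.
    Under plain SFT, the target would instead be the correct answer itself."""
    model_input = (
        f"Question: {query}\n"
        f"Candidate solution: {noisy_response}\n"
        "Critique the solution above, pointing out any errors."
    )
    return {"input": model_input, "output": critique}

# Hypothetical example in the spirit of the paper's math data:
example = build_cft_example(
    query="What is 17 * 24?",
    noisy_response="17 * 24 = 398",
    critique=("Incorrect: 17 * 24 = 408. The candidate likely miscomputed "
              "17 * 4 as 58 instead of 68 before adding 17 * 20 = 340."),
)
print(example["input"])
print(example["output"])
```

The key design point is that the model never trains on the noisy answer as a target; it only ever learns to produce the critique, which is what pushes it toward analysis rather than imitation.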
Why it matters?
This matters because it shows a more efficient way to make AI smarter, especially in areas that require deep thinking like math. CFT helps AI learn to think critically, not just memorize answers, which is closer to how humans learn. This could lead to AI that's better at solving complex problems and reasoning in various fields. It's also more efficient, needing less data to achieve better results, which could make developing advanced AI easier and more accessible.
Abstract
Supervised Fine-Tuning (SFT) is commonly used to train language models to imitate annotated responses for given instructions. In this paper, we challenge this paradigm and propose Critique Fine-Tuning (CFT), a strategy where models learn to critique noisy responses rather than simply imitate correct ones. Inspired by human learning processes that emphasize critical thinking, CFT encourages deeper analysis and nuanced understanding, traits often overlooked by standard SFT. To validate the effectiveness of CFT, we construct a 50K-sample dataset from WebInstruct, using GPT-4o as the teacher to generate critiques in the form of (input=[query; noisy response], output=critique). CFT on this dataset yields a consistent 4-10% improvement over SFT on six math benchmarks with different base models like Qwen2.5, Qwen2.5-Math and DeepSeek-Math. We further expand to MetaMath and NuminaMath datasets and observe similar gains over SFT. Notably, our Qwen2.5-Math-CFT model, trained on just 50K samples, matches or outperforms competitive models such as AceMath and Qwen2.5-Math-Instruct on most benchmarks, both of which use over 2M samples. Ablation studies show that CFT is robust to the source of noisy response and teacher critique model. Through these findings, we argue that critique-based training offers a more effective alternative to advance the reasoning of language models.