DeepCritic: Deliberate Critique with Large Language Models

Wenkai Yang, Jingwen Chen, Yankai Lin, Ji-Rong Wen

2025-05-02

Summary

This paper introduces DeepCritic, a framework that makes large language models better at checking math solutions by teaching them to give detailed, step-by-step feedback and to learn from their mistakes.

What's the problem?

When AI models review math solutions, they often miss errors or give unhelpful feedback, which limits their value for spotting mistakes and helping people learn.

What's the solution?

The researchers created a two-step process. First, the AI is trained to write detailed, step-by-step critiques of math solutions. Then reinforcement learning is applied to make it even better at finding and fixing errors.
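To make the reinforcement learning step more concrete, here is a minimal sketch of one way a critique could be scored during training. It is an illustration under stated assumptions, not the authors' implementation: the `Critique` structure, the `critique_reward` function, and the binary "first erroneous step matches the label" reward are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Critique:
    """Hypothetical container for a step-wise critique: one verdict per solution step."""
    step_verdicts: List[bool]  # True means the critic judged that step correct

    def first_error(self) -> Optional[int]:
        """Return the index of the first step judged incorrect, or None if every step passes."""
        for index, is_correct in enumerate(self.step_verdicts):
            if not is_correct:
                return index
        return None


def critique_reward(critique: Critique, gold_first_error: Optional[int]) -> float:
    """Assumed binary reward: 1.0 if the critic locates the same first error as the
    reference label (or agrees that there is none), otherwise 0.0."""
    return 1.0 if critique.first_error() == gold_first_error else 0.0


if __name__ == "__main__":
    # The critic flags step 2 (zero-indexed) as the first mistake, matching the label.
    critique = Critique(step_verdicts=[True, True, False, True])
    print(critique_reward(critique, gold_first_error=2))  # prints 1.0
```

A simple match-or-miss reward like this is easy to check automatically, which is why it is a common starting point for this kind of training; a real system might instead use a graded or otherwise richer signal.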

Why does it matter?

It makes AI tools more helpful for students and teachers: clearer explanations and more mistakes caught can improve learning and understanding in math.

Abstract

A novel two-stage framework using Qwen2.5-72B-Instruct enhances LLMs' math critique ability by generating detailed step-wise critiques and applying reinforcement learning, resulting in better error identification and refinement.