Self-Taught Self-Correction for Small Language Models
Viktor Moskvoretskii, Chris Biemann, Irina Nikishina
2025-03-13
Summary
This paper shows how to teach small AI language models to fix their own mistakes using only their own practice data, like a student learning from their own homework errors.
What's the problem?
Small AI models often make mistakes and cannot correct them without help from larger models or external tools, which is expensive and complicated.
What's the solution?
The STaSC method lets small AI models practice answering questions, spot their own errors, and improve through repeated training cycles using only their own generated answers.
Why does it matter?
This helps smaller AI models work better on tasks like answering questions without needing huge resources, making them cheaper and more accessible for everyday use.
Abstract
Although large language models (LLMs) have achieved remarkable performance across various tasks, they remain prone to errors. A key challenge is enabling them to self-correct. While prior research has relied on external tools or large proprietary models, this work explores self-correction in small language models (SLMs) through iterative fine-tuning using solely self-generated data. We introduce the Self-Taught Self-Correction (STaSC) algorithm, which incorporates multiple algorithmic design choices. Experimental results on a question-answering task demonstrate that STaSC effectively learns self-correction, leading to significant performance improvements. Our analysis further provides insights into the mechanisms of self-correction and the impact of different design choices on learning dynamics and overall performance. To support future research, we release our user-friendly codebase and lightweight models.
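The loop the abstract describes — answer, attempt a self-correction, keep only verified improvements, and fine-tune on them — can be sketched as a toy Python loop. This is a minimal illustration, not the paper's implementation: the `reviser`, the dict-based "model", and the memorization stand-in for fine-tuning are all hypothetical simplifications.

```python
def collect_corrections(answer_fn, correct_fn, dataset):
    """Gather (question, correction) pairs where self-correction turned a
    wrong initial answer into a correct one."""
    kept = []
    for question, gold in dataset:
        initial = answer_fn(question)
        revised = correct_fn(question, initial)
        # One possible filtering choice: keep only trajectories where the
        # revision fixes an initially wrong answer.
        if initial != gold and revised == gold:
            kept.append((question, revised))
    return kept

def fine_tune(model, pairs):
    """Stand-in for fine-tuning: a real run would do supervised fine-tuning
    on these pairs; this toy 'model' (a dict) simply memorizes them."""
    model.update(pairs)
    return model

def stasc_loop(model, reviser, dataset, iterations=3):
    """Iterate: sample answers and self-corrections from the current model,
    filter for verified improvements, and retrain on them."""
    for _ in range(iterations):
        answer_fn = lambda q: model.get(q, "")
        correct_fn = lambda q, a: reviser.get(q, a)
        pairs = collect_corrections(answer_fn, correct_fn, dataset)
        model = fine_tune(model, pairs)
    return model

# Toy usage: the model starts wrong on one question, and its (stubbed)
# self-correction ability recovers the right answer, which is then learned.
dataset = [("capital of France?", "Paris"), ("2+2?", "4")]
model = {"2+2?": "4"}
reviser = {"capital of France?": "Paris"}
model = stasc_loop(model, reviser, dataset)
```

The key design point mirrored here is that only self-generated, verified corrections ever enter the training set — no larger model or external tool supplies the labels during correction.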