Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Maggie Huan, Yuetai Li, Tuney Zheng, Xiaoyu Xu, Seungone Kim, Minxin Du, Radha Poovendran, Graham Neubig, Xiang Yue
2025-07-02
Summary
This paper examines how different training methods for large language models affect their ability to reason and solve problems, not just in math but in other domains as well. It shows that models tuned with reinforcement learning generalize across tasks better than those tuned with supervised fine-tuning.
What's the problem?
While some language models become very good at math reasoning through training, they often perform poorly when asked to solve problems in other subjects or on real-world tasks. This suggests that improving math skills alone does not necessarily improve overall reasoning ability.
What's the solution?
The researchers compared models trained with reinforcement learning (RL) against models trained with supervised fine-tuning (SFT) on math data. They found that RL-tuned models retain their general problem-solving skills across many tasks and domains, whereas SFT often causes models to forget previously learned capabilities, leading to poor transferability.
Why it matters?
This matters because the choice of training method affects not only a model's math skills but also its broader reasoning ability. Understanding this trade-off helps researchers build models that can solve a wider variety of problems and adapt to many different tasks.
Abstract
Reinforcement learning-tuned models generalize better across domains than supervised fine-tuned models on reasoning tasks, suggesting that standard training methods should be reconsidered.