DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning

Chengpeng Li, Guanting Dong, Mingfeng Xue, Ru Peng, Xiang Wang, Dayiheng Liu

2024-07-08

Summary

This paper introduces DotaMath, a new approach that improves how large language models (LLMs) solve complex math problems by breaking them into simpler parts and using code to assist in finding solutions.

What's the problem?

The main problem is that while LLMs have become good at solving simple math problems, they still struggle with more complicated ones. Traditional methods don't effectively help these models understand and solve difficult math tasks, leading to mistakes and misunderstandings.

What's the solution?

To tackle this issue, the authors introduce DotaMath, which uses a method called 'Decomposition of Thought.' This means the model breaks complex math problems into smaller, manageable subtasks. It then uses code to solve these subtasks and gets feedback from a code interpreter to correct any errors. The researchers also created a large dataset called DotaMathQA with over 574,000 examples of math problems to train the models effectively. By using this approach, DotaMath models perform significantly better on both familiar and new math challenges compared to existing models.
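
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: in DotaMath an LLM writes and rewrites the subtask code, whereas here the subtasks and the "correction" are canned, and all helper names (`run_subtask`, `solve_with_self_correction`, `CORRECTIONS`) are hypothetical.

```python
# Toy sketch of a decompose -> execute -> self-correct loop.
# Hypothetical names throughout; the real system uses an LLM to
# decompose the problem and to rewrite failing code.

def run_subtask(code: str, env: dict) -> tuple[bool, str]:
    """Execute one subtask's code, returning (ok, feedback) to mimic
    fine-grained feedback from a code interpreter."""
    try:
        exec(code, env)
        return True, "ok"
    except Exception as e:
        return False, repr(e)

def solve_with_self_correction(subtasks: list[str], max_retries: int = 2) -> dict:
    """Run decomposed subtasks in order; on an error, retry with a
    corrected version (a canned lookup here, an LLM rewrite in DotaMath)."""
    env: dict = {}
    for code in subtasks:
        ok, feedback = run_subtask(code, env)
        retries = 0
        while not ok and retries < max_retries:
            code = CORRECTIONS.get(code, code)  # stand-in for LLM self-correction
            ok, feedback = run_subtask(code, env)
            retries += 1
    return env

# Toy problem: "If 3 apples cost $6, what do 7 apples cost?"
CORRECTIONS = {"price = total / count": "price = total / n"}  # canned fix
subtasks = [
    "total, n = 6, 3",        # subtask 1: record the known quantities
    "price = total / count",  # subtask 2: buggy (NameError), triggers retry
    "answer = 7 * price",     # subtask 3: compute the final answer
]
env = solve_with_self_correction(subtasks)
print(env["answer"])  # → 14.0
```

The key idea the sketch captures is that the interpreter's error message (here, the `NameError` from subtask 2) is what drives the correction step, rather than the model guessing blindly.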

Why it matters?

This research is important because it enhances the ability of AI systems to handle complex mathematical reasoning. By improving how these models learn and solve problems, DotaMath can be used in educational tools, helping students understand difficult concepts better and providing accurate solutions in various applications where math is essential.

Abstract

Large language models (LLMs) have made impressive progress in handling simple math problems, yet they still struggle with more challenging and complex mathematical tasks. In this paper, we introduce a series of LLMs that employ Decomposition of thought with code assistance and self-correction for mathematical reasoning, dubbed DotaMath. DotaMath models tackle complex mathematical tasks by decomposing them into simpler logical subtasks, leveraging code to solve these subtasks, obtaining fine-grained feedback from the code interpreter, and engaging in self-reflection and correction. By annotating diverse interactive tool-use trajectories and employing query evolution on the GSM8K and MATH datasets, we generate an instruction fine-tuning dataset called DotaMathQA with 574K query-response pairs. We train a series of base LLMs using imitation learning on DotaMathQA, resulting in DotaMath models that achieve remarkable performance compared to open-source LLMs across various in-domain and out-of-domain benchmarks. Notably, DotaMath-deepseek-7B showcases an outstanding performance of 64.8% on the competitive MATH dataset and 86.7% on GSM8K. In addition, DotaMath-deepseek-7B maintains strong competitiveness on a series of in-domain and out-of-domain benchmarks (Avg. 80.1%). Looking forward, we anticipate that the DotaMath paradigm will open new pathways for addressing intricate mathematical problems. Our code is publicly available at https://github.com/ChengpengLi1003/DotaMath.