TabDSR: Decompose, Sanitize, and Reason for Complex Numerical Reasoning in Tabular Data
Changjiang Jiang, Fengchang Yu, Haihua Chen, Wei Lu, Jin Zeng
2025-11-05
Summary
This paper introduces a new system called TabDSR, designed to help large language models (LLMs) get better at answering complex questions based on information found in tables, like spreadsheets or databases.
What's the problem?
LLMs often struggle with tasks that require reasoning about data in tables. This is because questions can be complicated, the data itself might have errors or irrelevant information, and LLMs aren't naturally good at doing math or precise numerical calculations. Basically, they have trouble turning table data into accurate answers for tricky questions.
What's the solution?
The researchers created TabDSR, which works in three main steps. First, it breaks a complicated question down into smaller, more manageable sub-questions. Second, it cleans up the table by removing errors and filtering out irrelevant data. Finally, it uses a "program-of-thoughts" approach: it generates code that is executed to perform the calculations and derive the final answer from the cleaned-up table. The researchers also created a new dataset, CalTab151, to test these kinds of systems fairly and avoid data leakage from LLM training sets.
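The three steps above can be sketched as a small pipeline. This is a minimal illustration, not the paper's implementation: `llm` is a hypothetical stand-in for any chat-completion client, the prompts are made up, and the sanitizer is reduced to column filtering and missing-value removal.

```python
from typing import Callable

def decompose(question: str, llm: Callable[[str], str]) -> list[str]:
    """Step 1: ask the model to split a complex question into sub-questions."""
    reply = llm(f"Decompose into sub-questions, one per line:\n{question}")
    return [line.strip() for line in reply.splitlines() if line.strip()]

def sanitize(table: list[dict], relevant_cols: list[str]) -> list[dict]:
    """Step 2 (simplified): keep only relevant columns and drop rows
    with missing values in those columns."""
    cleaned = []
    for row in table:
        kept = {c: row.get(c) for c in relevant_cols}
        if all(v is not None for v in kept.values()):
            cleaned.append(kept)
    return cleaned

def reason(table: list[dict], sub_questions: list[str],
           llm: Callable[[str], str]) -> object:
    """Step 3: program-of-thoughts -- have the model emit Python that
    computes the answer from `table`, then execute that code."""
    code = llm(
        "Write Python that assigns the numeric answer to `answer`, "
        "using the variable `table` (a list of dicts).\n"
        f"Sub-questions: {sub_questions}"
    )
    scope = {"table": table}
    exec(code, scope)  # the real system would sandbox this step
    return scope["answer"]
```

Plugging in a real LLM client for `llm` and a prompt tuned to the table schema would turn this skeleton into a working pipeline; the key design point is that the final arithmetic is done by executed code, not by the model's token-by-token generation.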
Why does it matter?
This work is important because it significantly improves the ability of LLMs to handle real-world data analysis tasks. By making LLMs better at understanding and reasoning with tabular data, we can unlock their potential for a wider range of applications, like financial analysis, scientific research, and business intelligence. The system outperforms existing methods and works well with many different LLMs.
Abstract
Complex reasoning over tabular data is crucial in real-world data analysis, yet large language models (LLMs) often underperform due to complex queries, noisy data, and limited numerical capabilities. To address these issues, we propose TabDSR, a framework consisting of: (1) a query decomposer that breaks down complex questions, (2) a table sanitizer that cleans and filters noisy tables, and (3) a program-of-thoughts (PoT)-based reasoner that generates executable code to derive the final answer from the sanitized table. To ensure unbiased evaluation and mitigate data leakage, we introduce a new dataset, CalTab151, specifically designed for complex numerical reasoning over tables. Experimental results demonstrate that TabDSR consistently outperforms existing methods, achieving state-of-the-art (SOTA) performance with 8.79%, 6.08%, and 19.87% accuracy improvements on TAT-QA, TableBench, and CalTab151, respectively. Moreover, our framework integrates seamlessly with mainstream LLMs, providing a robust solution for complex tabular numerical reasoning. These findings highlight the effectiveness of our framework in enhancing LLM performance for complex tabular numerical reasoning. Data and code are available upon request.
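The abstract's key mechanism is that the PoT reasoner executes model-generated code rather than trusting the model's own arithmetic. A hedged sketch of that execution step is below; the restricted-builtins namespace is an illustrative precaution of our own, since the paper does not specify its sandboxing mechanism.

```python
import math

# Whitelist of builtins the generated code may use (illustrative choice).
ALLOWED_BUILTINS = {
    "sum": sum, "min": min, "max": max,
    "len": len, "round": round, "abs": abs,
}

def run_pot(code: str, table: list[dict]) -> object:
    """Execute model-generated PoT code in a minimal namespace and
    return whatever it assigns to `answer`."""
    scope = {"__builtins__": ALLOWED_BUILTINS, "math": math, "table": table}
    exec(code, scope)
    return scope.get("answer")
```

Because `__builtins__` is replaced, generated code that tries to call anything outside the whitelist (e.g. `open`) fails with a `NameError` instead of touching the filesystem; a production system would add timeouts and process isolation on top.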