Synthesizing Behaviorally-Grounded Reasoning Chains: A Data-Generation Framework for Personal Finance LLMs

Akhil Theerthala

2025-09-18

Summary

This research focuses on building better AI financial advisors. Current AI systems struggle to give personalized financial advice, and the pipelines that attempt it are costly to maintain relative to the results they deliver.

What's the problem?

Existing AI tools for finance either support investors and financial planners with specific tasks or try to handle broad personal finance planning (budgeting, debt management, retirement, and estate planning) through agentic pipelines. Both approaches have issues. The narrower tools aren't comprehensive, and the agentic pipelines are expensive to maintain and yield less than 25% of their expected financial returns. A further issue is that these systems do not account well for how people actually make financial decisions, which are shaped by psychology and individual circumstances.

What's the solution?

The researchers created a reproducible framework for generating training data and used it to make an AI model, specifically Qwen-3-8B, a more effective financial advisor. They built a dataset of about 19,000 examples of financial reasoning that combines relevant financial context with insights from behavioral finance (the study of how people behave with money). This curated data was used to fine-tune the model. The result is an 8-billion-parameter model that, on a held-out test split and in a blind LLM-jury study, performs comparably to much larger models (14-32 billion parameters) in factual accuracy, fluency, and personalization of advice.

Why does it matter?

This work is important because it shows that a capable, personalized financial advisor AI does not require a massive, expensive model. By focusing on carefully curated data grounded in behavioral finance, the researchers achieved performance comparable to much larger models at 80% lower cost. This could make helpful AI financial advice more affordable and more widely accessible.

Abstract

Personalized financial advice requires consideration of user goals, constraints, risk tolerance, and jurisdiction. Prior LLM work has focused on support systems for investors and financial planners. Simultaneously, numerous recent studies examine broader personal finance tasks, including budgeting, debt management, retirement, and estate planning, through agentic pipelines that incur high maintenance costs and yield less than 25% of their expected financial returns. In this study, we introduce a novel and reproducible framework that integrates relevant financial context with behavioral finance studies to construct supervision data for end-to-end advisors. Using this framework, we create a 19k-sample reasoning dataset and conduct a comprehensive fine-tuning of the Qwen-3-8B model on the dataset. Using a held-out test split and a blind LLM-jury study, we demonstrate that, with careful data curation and behavioral integration, our 8B model achieves performance comparable to significantly larger baselines (14-32B parameters) across factual accuracy, fluency, and personalization metrics while incurring 80% lower costs than the larger counterparts.
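
The abstract does not describe how the blind LLM-jury study is run. As a rough illustration only, the sketch below shows one way such a study could hide model identities from the jurors and then average per-metric scores. The function names, the `candidate_N` labels, and the 1-5 score scale are all assumptions made for this example and are not taken from the paper.

```python
import random
from statistics import mean

# Metric names follow the three axes named in the abstract.
METRICS = ("accuracy", "fluency", "personalization")


def blind(responses, seed=0):
    """Hide model names behind neutral labels (hypothetical helper).

    responses: {model_name: response_text}
    Returns (blinded, key) where blinded maps label -> response_text
    and key maps label -> model_name for un-blinding afterwards.
    """
    rng = random.Random(seed)
    names = list(responses)
    rng.shuffle(names)  # random order so labels do not leak identity
    key = {f"candidate_{i}": name for i, name in enumerate(names)}
    blinded = {label: responses[name] for label, name in key.items()}
    return blinded, key


def aggregate(jury_scores, key):
    """Average each metric over all jurors, then restore model names.

    jury_scores: list with one {label: {metric: score}} dict per juror.
    """
    return {
        name: {
            metric: mean(juror[label][metric] for juror in jury_scores)
            for metric in METRICS
        }
        for label, name in key.items()
    }
```

In this sketch each juror only ever sees the neutral labels; the `key` mapping is kept aside and applied after all scores are collected, so the ranking cannot be influenced by knowing which response came from the 8B model.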