Demystifying Domain-adaptive Post-training for Financial LLMs
Zixuan Ke, Yifei Ming, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty
2025-01-13

Summary
This paper is about making AI language models better at understanding finance by training them specifically on financial knowledge and tasks. The researchers created a new framework called FINDAP to figure out the best way to do this.
What's the problem?
Big AI language models are good at lots of things, but they're not experts in specific areas like finance. It's hard to teach them about finance without degrading what they already know. It's also tricky to decide exactly what to teach them and how to check whether they've learned it well.
What's the solution?
The researchers made a step-by-step plan called FINDAP. First, they defined the capabilities a finance-expert AI should have. Then they built an evaluation suite to check whether the AI actually learned them. They tested different training stages: continual pretraining on lots of finance text, instruction tuning to teach it specific tasks, and preference alignment to teach it what good answers look like. They also came up with a clever way to pick the most useful training examples, using a reward model that judges the quality of each reasoning step. In the end, they created a new model called Llama-Fin that's really good at finance tasks.
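The training stages described above run in a fixed order. Here is a highly simplified sketch of that ordering in plain Python; the function names mirror the paper's stage names, but every body is a placeholder, not the actual training code:

```python
# Toy sketch of the post-training pipeline order described in the paper:
# continual pretraining -> instruction tuning -> preference alignment.
# The "model" is just a dict tracking what it has been exposed to.

def continual_pretrain(model, finance_corpus):
    """Expose the model to raw in-domain text to build finance knowledge."""
    model["stages"].append("continual_pretraining")
    model["seen_tokens"] += sum(len(doc.split()) for doc in finance_corpus)
    return model

def instruction_tune(model, task_examples):
    """Teach the model to follow instructions for specific finance tasks."""
    model["stages"].append("instruction_tuning")
    model["seen_tasks"] += len(task_examples)
    return model

def preference_align(model, preference_pairs):
    """Push the model toward preferred answers over rejected ones."""
    model["stages"].append("preference_alignment")
    model["seen_pairs"] += len(preference_pairs)
    return model

model = {"stages": [], "seen_tokens": 0, "seen_tasks": 0, "seen_pairs": 0}
model = continual_pretrain(model, ["Q3 revenue rose 4 percent year over year."])
model = instruction_tune(model, [("Summarize this filing.", "Revenue grew.")])
model = preference_align(model, [("good answer", "bad answer")])
print(model["stages"])
```

Running the sketch prints the three stage names in order, which is the key point: each stage builds on the previous one rather than replacing it.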
Why it matters?
This research matters because it shows how to make AI models expert in a specific field like finance without forgetting their general abilities, which is valuable for businesses and people working with money. It could lead to AI that is better at complicated jobs in areas like banking, investing, or economic planning. And the method itself may carry over to building expert AI for other fields, not just finance.
Abstract
Domain-adaptive post-training of large language models (LLMs) has emerged as a promising approach for specialized domains such as medicine and finance. However, significant challenges remain in identifying optimal adaptation criteria and training strategies across varying data and model configurations. To address these challenges, we introduce FINDAP, a systematic and fine-grained investigation into domain-adaptive post-training of LLMs for the finance domain. Our approach begins by identifying the core capabilities required for the target domain and designing a comprehensive evaluation suite aligned with these needs. We then analyze the effectiveness of key post-training stages, including continual pretraining, instruction tuning, and preference alignment. Building on these insights, we propose an effective training recipe centered on a novel preference data distillation method, which leverages process signals from a generative reward model. The resulting model, Llama-Fin, achieves state-of-the-art performance across a wide range of financial tasks. Our analysis also highlights how each post-training stage contributes to distinct capabilities, uncovering specific challenges and effective solutions, providing valuable insights for domain adaptation of LLMs. Project page: https://github.com/SalesforceAIResearch/FinDap
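The abstract's "preference data distillation" leverages process signals: a generative reward model judges the quality of each reasoning step, and candidate answers are ranked to form chosen/rejected preference pairs. Below is a minimal illustrative sketch of that idea; the step-scoring function is a stand-in for the paper's generative reward model, and all data is invented for the example:

```python
# Illustrative sketch: build a chosen/rejected preference pair by scoring
# each candidate's reasoning steps. `mock_step_scores` is a stand-in for
# a generative reward model, NOT the paper's actual implementation.

def mock_step_scores(steps):
    """Stand-in process reward: score each reasoning step in [0, 1]."""
    return [0.0 if "wrong" in step else 1.0 for step in steps]

def build_preference_pair(candidates):
    """candidates: list of (answer, reasoning_steps).
    Returns (chosen, rejected) by mean step score."""
    scored = []
    for answer, steps in candidates:
        scores = mock_step_scores(steps)
        scored.append((sum(scores) / len(scores), answer))
    scored.sort(reverse=True)          # best candidate first
    return scored[0][1], scored[-1][1]

candidates = [
    ("$4.2M", ["parse the filing", "sum segment revenues"]),
    ("$9.9M", ["parse the filing", "wrong: double-counted a segment"]),
]
chosen, rejected = build_preference_pair(candidates)
print(chosen, rejected)  # -> $4.2M $9.9M
```

The point of scoring at the step level rather than only the final answer is that a wrong intermediate step can be penalized even when it happens to produce a plausible-looking result.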