An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging
Kunat Pipatanakul, Pittawat Taveekitworachai, Potsawee Manakul, Kasima Tharnpipitchai
2025-02-14
Summary
This paper presents a way to make language-specific AI models better at reasoning, focusing on improving a Thai language model by borrowing skills from a more advanced reasoning model called DeepSeek R1.
What's the problem?
AI models for less common languages, like Thai, aren't as good at reasoning as models for widely-used languages like English or Chinese. This is because most AI training focuses on these popular languages, leaving others behind. As a result, AI models for less common languages struggle with complex thinking tasks and can't switch between languages smoothly.
What's the solution?
The researchers found a way to teach the Thai AI model advanced reasoning skills from DeepSeek R1, a model that is very good at logical thinking. They did this by carefully selecting which training data to use and by merging the Thai model's weights with those of a DeepSeek R1 distilled model. Remarkably, they managed to do this using only public datasets and about $120 worth of compute, which is extremely cheap for AI research.
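The paper does not spell out its merging recipe in this summary, but the core idea behind model merging in general can be sketched as parameter-wise linear interpolation between two models that share an architecture. The function and parameter names below are hypothetical, and plain floats stand in for the weight tensors of real models; this is an illustration of the technique, not the authors' exact method.

```python
# Minimal sketch of model merging via parameter-wise linear interpolation.
# Hypothetical names; real models store millions of tensors per parameter name,
# while plain floats are used here so the example stays self-contained.

def merge_models(base, donor, alpha=0.5):
    """Blend two models' parameters: (1 - alpha) * base + alpha * donor.

    base, donor: dicts mapping parameter names to values; the two models
    must share an architecture, i.e. the same set of parameter names.
    alpha: how strongly to blend in the donor (e.g. reasoning) model.
    """
    assert base.keys() == donor.keys(), "models must share an architecture"
    return {
        name: (1 - alpha) * base[name] + alpha * donor[name]
        for name in base
    }

# Toy example: a language-specific base model and a reasoning donor model.
thai_model = {"layer1.weight": 0.2, "layer2.weight": -0.4}
reasoning_model = {"layer1.weight": 0.8, "layer2.weight": 0.0}

# With alpha=0.5, each merged parameter is the midpoint of the two models'.
merged = merge_models(thai_model, reasoning_model, alpha=0.5)
```

Because merging only averages existing weights, it needs no gradient updates at all, which is one reason approaches like this can fit into a very small compute budget compared with full fine-tuning.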
Why it matters?
This matters because it shows we can make AI models for less common languages much smarter without spending a ton of money or needing special resources. It could help create better AI assistants and tools for people who speak languages that aren't as widely used in technology. This approach could lead to more equal access to advanced AI across different languages and cultures.
Abstract
This paper investigates data selection and model merging methodologies aimed at incorporating advanced reasoning capabilities such as those of DeepSeek R1 into language-specific large language models (LLMs), with a particular focus on the Thai LLM. Our goal is to enhance the reasoning capabilities of language-specific LLMs while maintaining their target language abilities. DeepSeek R1 excels in reasoning but primarily benefits high-resource languages such as English and Chinese. However, low-resource languages remain underserved due to the dominance of English-centric training data and model optimizations, which limit performance in these languages. This limitation results in unreliable code-switching and diminished effectiveness on tasks in low-resource languages. Meanwhile, local and regional LLM initiatives have attempted to bridge this gap by developing language-specific LLMs that focus on improving local linguistic fidelity. We demonstrate that, with only publicly available datasets and a computational budget of $120, it is possible to enhance the reasoning capabilities of language-specific LLMs to match the level of DeepSeek R1, without compromising their performance on target language tasks.