
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale

Hasan Abed Al Kader Hammoud, Mohammad Zbeeb, Bernard Ghanem

2025-09-18


Summary

This paper introduces Hala, a family of language models built specifically for Arabic. The models are strong at both following instructions and translating between Arabic and English.

What's the problem?

Arabic language processing lags behind English because far fewer high-quality datasets and models are available. Existing models tend to underperform on Arabic tasks, and training new ones from scratch is very resource-intensive. In short, it's hard to get computers to understand and generate Arabic as well as they do English.

What's the solution?

The researchers built Hala with a 'translate-and-tune' pipeline. First, they compressed a strong English-Arabic translation model to FP8 precision, roughly doubling its throughput without losing quality, and used it to generate high-fidelity bilingual training data. They then fine-tuned a lightweight model (LFM2-1.2B) on this data and used it to translate high-quality English instruction sets into Arabic, producing a million-scale dataset tailored to Arabic instruction following. Finally, they trained Hala models of several sizes on this dataset and applied 'slerp merging' (spherical linear interpolation of model weights) to balance Arabic specialization against the general strengths of each base model; a sketch of this merging step follows below.
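The page doesn't reproduce the paper's merging code, but slerp itself is a standard technique: instead of averaging two checkpoints linearly, you interpolate along the arc between their weight vectors, which tends to preserve each model's behavior better when the weights are far apart. Below is a minimal PyTorch sketch of per-tensor slerp between a base checkpoint and its Arabic fine-tune; the function names, the interpolation factor t, and the per-tensor application are illustrative assumptions, not the paper's actual recipe.

```python
import torch

def slerp(w_base: torch.Tensor, w_tuned: torch.Tensor, t: float = 0.5,
          eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    a = w_base.flatten().float()
    b = w_tuned.flatten().float()
    # Angle between the two weight vectors (computed on normalized copies).
    a_unit = a / (a.norm() + eps)
    b_unit = b / (b.norm() + eps)
    omega = torch.acos(torch.clamp(a_unit @ b_unit, -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        merged = (torch.sin((1.0 - t) * omega) / so) * a \
               + (torch.sin(t * omega) / so) * b
    return merged.reshape(w_base.shape).to(w_base.dtype)

def merge_models(base_sd: dict, tuned_sd: dict, t: float = 0.5) -> dict:
    """Merge two checkpoints parameter by parameter.

    Assumes both state dicts share the same keys and tensor shapes.
    """
    return {name: slerp(base_sd[name], tuned_sd[name], t) for name in base_sd}
```

At t = 0 this returns the base weights and at t = 1 the fine-tuned weights; intermediate values trade general ability against Arabic specialization, which is the balance the authors describe.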

Why it matters?

Hala represents a significant step forward for Arabic NLP. It achieves state-of-the-art results on several Arabic-centric benchmarks among models of similar size, and the researchers are releasing their models, data, evaluation, and training recipes publicly. This will help other researchers build even better Arabic language technologies, ultimately making AI more accessible and useful for Arabic speakers.

Abstract

We present Hala, a family of Arabic-centric instruction and translation models built with our translate-and-tune pipeline. We first compress a strong AR↔EN teacher to FP8 (yielding ~2× higher throughput with no quality loss) and use it to create high-fidelity bilingual supervision. A lightweight language model LFM2-1.2B is then fine-tuned on this data and used to translate high-quality English instruction sets into Arabic, producing a million-scale corpus tailored to instruction following. We train Hala models at 350M, 700M, 1.2B, and 9B parameters, and apply slerp merging to balance Arabic specialization with base-model strengths. On Arabic-centric benchmarks, Hala achieves state-of-the-art results within both the "nano" (≤2B) and "small" (7–9B) categories, outperforming their bases. We release models, data, evaluation, and recipes to accelerate research in Arabic NLP.
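The abstract's ~2× throughput claim comes from compressing the teacher to FP8. The report's exact quantization scheme isn't given on this page, so the following is only a minimal sketch of per-tensor FP8 (E4M3) weight quantization in PyTorch; the scale handling and function names are assumptions for illustration.

```python
import torch

def quantize_fp8(w: torch.Tensor):
    """Per-tensor FP8 (E4M3) quantization: scale so the largest magnitude
    fits the FP8 range, then cast. Returns the FP8 tensor and its scale."""
    fp8_max = torch.finfo(torch.float8_e4m3fn).max  # 448.0 for E4M3
    scale = w.abs().max().clamp(min=1e-12) / fp8_max
    return (w / scale).to(torch.float8_e4m3fn), scale

def dequantize_fp8(w_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate full-precision tensor from FP8 weights."""
    return w_fp8.to(torch.float32) * scale

# FP8 weights take one byte each, half the memory of FP16, which is where
# the inference throughput gain comes from on FP8-capable GPUs.
w = torch.randn(4096, 4096)
w_fp8, s = quantize_fp8(w)
max_err = (dequantize_fp8(w_fp8, s) - w).abs().max()
```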