Bielik v3 Small: Technical Report
Krzysztof Ociepa, Łukasz Flis, Remigiusz Kinas, Krzysztof Wróbel, Adrian Gwoździej
2025-05-12
Summary
This report introduces Bielik v3, a series of generative text models that deliver strong understanding and generation of Polish while using substantially fewer parameters and less compute than comparable large models.
What's the problem?
Most state-of-the-art language models are trained primarily on English and other high-resource languages, so they underperform on Polish. At the same time, building capable models that fit within modest compute and memory budgets remains difficult.
What's the solution?
The researchers built Bielik v3 around three main techniques: a custom tokenizer tailored to Polish, a training objective called Weighted Instruction Cross-Entropy Loss, and an Adaptive Learning Rate schedule. Together, these allow a smaller model to match the performance of much larger ones.
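The report names Weighted Instruction Cross-Entropy Loss but does not spell out its formula in this summary. The sketch below shows the general idea as commonly understood: each training example's token-level cross-entropy is scaled by a per-instruction quality weight, so higher-quality instructions contribute more to the gradient. The function and variable names (`weighted_instruction_ce`, `weights`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def weighted_instruction_ce(logits, targets, weights):
    """Cross-entropy per example, scaled by a per-instruction weight.

    logits:  (batch, vocab) unnormalized scores
    targets: (batch,) gold token indices
    weights: (batch,) instruction-quality weights
    """
    probs = softmax(logits)
    # Negative log-likelihood of the gold token for each example.
    nll = -np.log(probs[np.arange(len(targets)), targets])
    # Weighted average: better instructions dominate the loss.
    return (weights * nll).sum() / weights.sum()
```

In practice the weights would come from a data-quality scoring step; setting an example's weight to zero removes its influence on the loss entirely.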
Why it matters?
This work brings high-quality language AI to Polish speakers rather than only English speakers. It also demonstrates that strong language models can be trained and run without massive compute infrastructure, making AI more accessible and affordable.
Abstract
Bielik v3 is a series of parameter-efficient generative text models that achieve high performance on Polish language processing through a custom Polish tokenizer, a Weighted Instruction Cross-Entropy Loss, and an Adaptive Learning Rate.