Bielik 11B v2 Technical Report

Krzysztof Ociepa, Łukasz Flis, Krzysztof Wróbel, Adrian Gwoździej, Remigiusz Kinas

2025-05-12

Summary

This paper introduces Bielik 11B v2, a large language model designed specifically for Polish that performs extremely well on Polish-language benchmarks, even compared to some larger models.

What's the problem?

Most powerful language models are built for English or other widely spoken languages, so they perform noticeably worse on Polish. On top of that, running very large models demands a lot of compute, which is expensive and hard to manage.

What's the solution?

The researchers created Bielik 11B v2, an 11-billion-parameter model, and trained it with two key techniques: Weighted Instruction Cross-Entropy Loss and an Adaptive Learning Rate. These methods help the model learn more efficiently and perform better, letting it beat some larger models on Polish tasks while using fewer resources. A sketch of how the weighted loss could work appears below.
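To make the idea concrete, here is a minimal PyTorch sketch, not the authors' code: it assumes each training example carries a hypothetical quality score (example_weights), masks out prompt and padding tokens with ignore_index, and lets higher-weighted examples contribute more to the batch loss. The adaptive_lr helper shows one plausible reading of the Adaptive Learning Rate idea, scaling the step size with context length; the report's exact schedule may differ.

```python
import torch
import torch.nn.functional as F

def weighted_instruction_ce_loss(logits, targets, example_weights, ignore_index=-100):
    """Cross-entropy weighted per training example.

    logits:          (batch, seq_len, vocab) model outputs
    targets:         (batch, seq_len) token ids; prompt/padding set to ignore_index
    example_weights: (batch,) hypothetical per-example quality scores
    """
    batch, seq_len, vocab = logits.shape
    # Per-token loss, keeping the batch/sequence structure.
    token_loss = F.cross_entropy(
        logits.reshape(-1, vocab),
        targets.reshape(-1),
        ignore_index=ignore_index,
        reduction="none",
    ).reshape(batch, seq_len)
    # Average over each example's response tokens only.
    mask = (targets != ignore_index).float()
    per_example = (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
    # Higher-quality examples pull the gradient harder.
    return (example_weights * per_example).sum() / example_weights.sum().clamp(min=1e-8)

def adaptive_lr(base_lr, context_len, base_context_len=4096):
    # One plausible schedule (an assumption, not the report's formula):
    # scale the learning rate with the square root of the context-length ratio.
    return base_lr * (context_len / base_context_len) ** 0.5
```

In this reading, a curated dataset would assign each instruction pair a quality score, so noisy or low-quality examples still contribute signal without dominating training.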

Why it matters?

This matters because it gives Polish speakers access to top-level AI tools in their own language, and it shows that you don't always need the biggest model to get the best results. It also helps make advanced AI more affordable and practical for people and companies who use Polish.

Abstract

Bielik 11B v2, a scaled language model with 11B parameters, excels on Polish benchmarks through Weighted Instruction Cross-Entropy Loss and Adaptive Learning Rate, outperforming larger models and demonstrating resource-efficient deployment.