T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground
Dmitrii Stoianov, Danil Taranets, Olga Tsymboi, Ramil Latypov, Almaz Dautov, Vladislav Kruglikov, Nikita Surkov, German Abramov, Pavel Gein, Dmitry Abulkhanov, Mikhail Gashkov, Viktor Zelenkovskiy, Artem Batalov, Aleksandr Medvedev, Anatolii Potapov
2025-12-12
Summary
This paper introduces T-pro 2.0, a new, openly available large language model (LLM) built specifically for Russian. It can either answer questions directly or show its step-by-step reasoning, and it is engineered to run quickly and efficiently.
What's the problem?
There aren't many powerful, openly accessible LLMs for Russian. Existing models often aren't available for researchers to study or build upon, many struggle with complex reasoning tasks, and getting them to respond quickly is a challenge of its own.
What's the solution?
The authors built T-pro 2.0 with a focus on the Russian language, using a tokenizer that is dense in Cyrillic tokens so Russian text is split into fewer pieces and processed more cheaply. They also sped up generation with a technique called 'speculative decoding' (an adapted EAGLE pipeline). To help others, they have released not only the model weights, but also the T-Wix 500k instruction corpus used to train it, the T-Math benchmark for testing reasoning skills, and the EAGLE weights that power the faster inference, all on the Hugging Face platform.
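To make the hybrid-reasoning idea concrete, here is a minimal sketch of loading the released weights with the Hugging Face transformers library. The repository id t-tech/T-pro-it-2.0 and the enable_thinking chat-template flag are assumptions about how the model is distributed (common for Qwen-style releases), not details stated in this summary.

```python
# Minimal sketch: loading T-pro 2.0 from Hugging Face and generating an answer.
# NOTE: the repo id "t-tech/T-pro-it-2.0" and the `enable_thinking` flag are
# assumptions about the release layout, not facts taken from the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "t-tech/T-pro-it-2.0"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# "Сколько будет 17 * 23?" = "What is 17 * 23?"
messages = [{"role": "user", "content": "Сколько будет 17 * 23?"}]

# Hybrid reasoning: the flag below toggles reasoning-trace generation in
# Qwen-style chat templates; whether T-pro 2.0 uses this exact flag is an
# assumption.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # False => direct answering mode
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

With enable_thinking=False the same template should produce a direct answer without an intermediate reasoning trace, which would correspond to the two modes exposed in the web demo.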
Why it matters?
T-pro 2.0 matters because it gives researchers and developers a free, open platform for exploring and improving Russian-language AI. It enables closer study of how these models reason in Russian, and it makes it easier to build practical Russian-language applications such as chatbots or translation tools.
Abstract
We introduce T-pro 2.0, an open-weight Russian LLM for hybrid reasoning and efficient inference. The model supports direct answering and reasoning-trace generation, using a Cyrillic-dense tokenizer and an adapted EAGLE speculative-decoding pipeline to reduce latency. To enable reproducible and extensible research, we release the model weights, the T-Wix 500k instruction corpus, the T-Math reasoning benchmark, and the EAGLE weights on Hugging Face. These resources allow users to study Russian-language reasoning and to extend or adapt both the model and the inference pipeline. A public web demo exposes reasoning and non-reasoning modes and illustrates the speedups achieved by our inference stack across domains. T-pro 2.0 thus serves as an accessible open system for building and evaluating efficient, practical Russian LLM applications.
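As one way to exercise the released EAGLE weights, the sketch below configures speculative decoding in vLLM. The repository ids and the speculative_config schema are assumptions (vLLM's speculative-decoding API has changed across versions), not the paper's own serving code.

```python
# Minimal sketch: serving T-pro 2.0 with EAGLE speculative decoding in vLLM.
# NOTE: both repo ids and the speculative_config schema are assumptions;
# vLLM's speculative-decoding API differs between releases, so check yours.
from vllm import LLM, SamplingParams

llm = LLM(
    model="t-tech/T-pro-it-2.0",  # assumed target-model repo id
    speculative_config={
        "method": "eagle",                      # EAGLE-style draft head
        "model": "t-tech/T-pro-it-2.0-eagle",   # assumed draft-weights repo id
        "num_speculative_tokens": 4,            # draft tokens proposed per step
    },
)

params = SamplingParams(temperature=0.0, max_tokens=256)
# "Объясни, что такое спекулятивное декодирование." =
# "Explain what speculative decoding is."
outputs = llm.generate(["Объясни, что такое спекулятивное декодирование."], params)
print(outputs[0].outputs[0].text)
```

The draft model proposes several tokens per step and the target model verifies them in one forward pass, so accepted drafts translate directly into lower latency; the speedup the authors report would depend on how often the EAGLE drafts are accepted in each domain.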