Jan-nano Technical Report
Alan Dao, Dinh Bach Vu
2025-06-30
Summary
This paper introduces Jan-nano, a 4-billion-parameter language model designed to be both efficient and capable by specializing in finding information quickly rather than trying to memorize everything.
What's the problem?
Most language models need large amounts of computing power to perform well, which makes them difficult to run on ordinary computers and slow to respond.
What's the solution?
Jan-nano is trained with a novel multi-stage Reinforcement Learning with Verifiable Rewards (RLVR) method that skips some traditional training steps and specializes the model in working with tools and external sources of information. It also supports a very long context window, letting it keep track of large amounts of information at once, and it runs smoothly on consumer-grade hardware.
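To make the tool-focused design concrete, here is a minimal sketch of how a tool-augmented query could be issued to the model through the Hugging Face transformers chat-template API. The model identifier, the web_search tool definition, and whether Jan-nano's chat template accepts a tools argument are assumptions for illustration, not details taken from the report.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model identifier; substitute the actual released checkpoint name.
model_id = "Menlo/Jan-nano"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit consumer GPUs
    device_map="auto",
)

# Hypothetical search tool the model can call instead of answering from memory.
tools = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return short text snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "Who won the 2024 Nobel Prize in Physics?"}]

# Render the conversation plus tool schema with the model's chat template
# (assumes the template supports tool definitions).
inputs = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In this setup the model is expected to emit a call to web_search rather than answer from its own parameters, which is the behavior the retrieval-focused training is meant to encourage.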
Why does it matter?
This matters because Jan-nano shows that smart design and training can make AI models powerful yet efficient enough to run on everyday computers, making advanced AI more accessible and practical for real-world research and applications.
Abstract
Jan-nano is a 4B-parameter language model that achieves high efficiency and strong performance through specialized fine-tuning with multi-stage RLVR, operating on consumer hardware with a 128K-token context length.