
Hermes 4 Technical Report

Ryan Teknium, Roger Jin, Jai Suphavadeeprasit, Dakota Mahan, Jeffrey Quesnelle, Joe Li, Chen Guang, Shannon Sands, Karan Malhotra

2025-08-26


Summary

This paper introduces Hermes 4, a family of AI models that are good at both following instructions and reasoning through problems step by step, the way a person would when working through a complex task.

What's the problem?

Creating AI that can truly *reason* is hard. Existing models often either excel at following directions but struggle with multi-step problems, or can work through problems but miss what you actually asked for. Building a model that does both well requires carefully prepared data, a sound training process, and ways to accurately measure how well it's working. Getting all of this to work at large scale is a significant challenge.

What's the solution?

The researchers built Hermes 4 by working through the whole pipeline: curating and synthesizing data, training, and evaluation. They created a large dataset specifically designed to teach the model to reason through problems over multiple steps while also understanding and following a wide range of instructions. They then trained the model and tested it on benchmarks covering math, coding, knowledge, comprehension, and alignment, and also looked closely at *how* the model arrived at its answers.

Why does it matter?

Hermes 4 represents a step forward in building more capable and reliable AI. By creating a model that can both reason and follow instructions, it opens up possibilities for AI to be used in more complex and helpful ways, like assisting with problem-solving, writing code, or making sense of complicated information. Importantly, the researchers are releasing the model weights publicly, allowing other researchers to build on their work and accelerate progress in the field.

Abstract

We present Hermes 4, a family of hybrid reasoning models that combine structured, multi-turn reasoning with broad instruction-following ability. We describe the challenges encountered during data curation, synthesis, training, and evaluation, and outline the solutions employed to address these challenges at scale. We comprehensively evaluate across mathematical reasoning, coding, knowledge, comprehension, and alignment benchmarks, and we report both quantitative performance and qualitative behavioral analysis. To support open research, all model weights are published publicly at https://huggingface.co/collections/NousResearch/hermes-4-collection-68a731bfd452e20816725728
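
Since the weights are openly released, one quick way to experiment is to load a checkpoint from the collection linked above. The sketch below uses Hugging Face transformers; the exact repository id is an assumption for illustration, so check the collection page for the actual model names and sizes.

```python
# Minimal sketch: loading a Hermes 4 checkpoint and asking for step-by-step reasoning.
# The repository id below is an assumption for illustration; pick an actual
# checkpoint from the collection linked in the abstract.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Hermes-4-70B"  # hypothetical id; see the collection page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "Reason step by step: what is 17 * 24?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```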