Atla Selene Mini: A General Purpose Evaluation Model

Andrei Alexandru, Antonia Calvi, Henry Broomfield, Jackson Golden, Kyle Dai, Mathias Leys, Maurice Burger, Max Bartolo, Roman Engeler, Sashank Pisupati, Toby Drane, Young Sun Park

2025-01-30

Summary

This paper introduces Atla Selene Mini, a new AI model designed to evaluate the outputs of other AI models. It's like a well-trained teacher who can grade the work of other AIs, even ones that are much bigger and more complex than itself.

What's the problem?

Evaluating AI models is tricky because it's hard to build a fair and accurate grading system. Current methods often use very large, complex AI models to judge other AIs, which can be expensive to run and can produce biased or inconsistent grades. It's like relying on a single examiner whose personal preferences quietly shape every score they give.

What's the solution?

The researchers created Atla Selene Mini, a smaller AI model that's specifically trained to be a fair judge. They used a principled strategy for curating training data, combining real examples with synthetically generated critiques and filtering out low-quality cases. They also trained the model with a combined objective of direct preference optimization (DPO) and supervised fine-tuning (SFT), which helps it learn to make good judgments and explain its reasoning. They tested Selene Mini on many different tasks and found that it performs better than much larger models, even in real-world situations like evaluating medical and financial information.
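To make the combined training objective concrete, here is a minimal per-example sketch. This is not the authors' implementation: the mixing weight `sft_weight` and the exact way the DPO and SFT terms are added together are assumptions for illustration.

```python
import math

def dpo_sft_loss(policy_chosen_logp, policy_rejected_logp,
                 ref_chosen_logp, ref_rejected_logp,
                 sft_nll, beta=0.1, sft_weight=1.0):
    """Sketch of a combined DPO + SFT loss for one training example.

    The DPO term pushes the policy to prefer the 'chosen' judgment
    over the 'rejected' one, relative to a frozen reference model.
    The SFT term is the usual negative log-likelihood of the chosen
    judgment. How the two are weighted here is an assumption.
    """
    # Log-ratio margin: how much more the policy prefers chosen over
    # rejected, compared with the reference model's preference.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # DPO loss: negative log-sigmoid of the scaled margin.
    dpo = -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
    return dpo + sft_weight * sft_nll
```

When the margin is zero (policy and reference agree), the DPO term equals ln 2; as the policy learns to prefer the chosen judgment more strongly than the reference does, the term shrinks toward zero.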

Why it matters?

This matters because as AI becomes more common in our daily lives, we need good ways to make sure these AIs are working correctly and safely. Selene Mini provides a tool that's both powerful and easy to use, which could help researchers and companies improve their AI models more quickly and accurately. By making this tool freely available, the researchers are helping the whole AI community work together to create better, safer AI systems for everyone.

Abstract

We introduce Atla Selene Mini, a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini is a general-purpose evaluator that outperforms the best SLMJs and GPT-4o-mini on overall performance across 11 out-of-distribution benchmarks, spanning absolute scoring, classification, and pairwise preference tasks. It is the highest-scoring 8B generative model on RewardBench, surpassing strong baselines like GPT-4o and specialized judges. To achieve this, we develop a principled data curation strategy that augments public datasets with synthetically generated critiques and ensures high quality through filtering and dataset ablations. We train our model on a combined direct preference optimization (DPO) and supervised fine-tuning (SFT) loss, and produce a highly promptable evaluator that excels in real-world scenarios. Selene Mini shows dramatically improved zero-shot agreement with human expert evaluations on financial and medical industry datasets. It is also robust to variations in prompt format. Preliminary results indicate that Selene Mini is the top-ranking evaluator in a live, community-driven Judge Arena. We release the model weights on HuggingFace (https://hf.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) and Ollama to encourage widespread community adoption.
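Since the abstract highlights that Selene Mini is "highly promptable", here is a rough sketch of how one might assemble an absolute-scoring evaluation prompt for an LLM-as-a-judge. The template wording and structure are assumptions, not the model's actual expected format; consult the model card on HuggingFace for the real prompt.

```python
def build_judge_prompt(instruction, response, rubric, scale=(1, 5)):
    """Assemble an illustrative absolute-scoring prompt for an
    LLM-as-a-judge. The section headers and wording are hypothetical;
    the released model card defines the actual format."""
    lo, hi = scale
    return (
        f"You are an expert evaluator. Score the response on a scale "
        f"of {lo} to {hi} according to the rubric, explaining your "
        f"reasoning before giving the final score.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}\n\n"
        f"### Rubric:\n{rubric}\n"
    )

prompt = build_judge_prompt(
    "Summarize the causes of inflation.",
    "Inflation happens when prices rise over time.",
    "Score factual accuracy and completeness.",
)
```

The resulting string would then be sent to the model through whatever inference stack you use, e.g. the HuggingFace weights or the Ollama release mentioned above.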