The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations
Carlos Arriaga, Gonzalo Martínez, Eneko Sendin, Javier Conde, Pedro Reviriego
2025-07-21
Summary
This paper presents the Generative Energy Arena (GEA), a platform where people evaluate large language models (LLMs) while being shown how much energy each model uses to generate its responses.
What's the problem?
Most ways of judging LLMs focus only on how good their answers are and ignore how much energy the models consume. This matters because larger models use far more energy, increasing the environmental cost of running AI systems.
What's the solution?
The authors created GEA, where users compare answers from different LLMs while also being shown each model's relative energy use. This makes the trade-off between energy consumption and answer quality visible at the moment users choose which model they prefer.
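The paper does not publish GEA's implementation, but the mechanism it describes, collecting pairwise preference votes alongside energy labels, can be sketched as follows. All names, data structures, and energy figures here are hypothetical illustrations, not GEA's actual code.

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class Vote:
    """One hypothetical pairwise comparison, as a GEA-style arena might record it."""
    model_a: str
    model_b: str
    winner: str          # which model's answer the user preferred
    energy_a_wh: float   # relative energy shown to the user (illustrative units)
    energy_b_wh: float

def tally(votes):
    """Count wins per model, and how often the more energy-efficient model won."""
    wins = Counter(v.winner for v in votes)
    efficient_wins = sum(
        1 for v in votes
        # True when the preferred model is also the lower-energy one
        if (v.winner == v.model_a) == (v.energy_a_wh < v.energy_b_wh)
    )
    return wins, efficient_wins

# Illustrative votes: a small model competing against a larger, costlier one.
votes = [
    Vote("small-llm", "large-llm", "small-llm", 0.2, 1.5),
    Vote("small-llm", "large-llm", "large-llm", 0.2, 1.5),
    Vote("small-llm", "large-llm", "small-llm", 0.2, 1.5),
]
wins, efficient_wins = tally(votes)
print(wins)            # Counter({'small-llm': 2, 'large-llm': 1})
print(efficient_wins)  # 2
```

Aggregating votes this way is what lets an arena report whether users, once aware of energy costs, tend to favor the more efficient model, which is the paper's central finding.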
Why does it matter?
This matters because it encourages the use of smaller, energy-efficient models when their answers are good enough, reducing the environmental impact of large AI systems while still delivering useful results.
Abstract
GEA, a public arena that incorporates energy consumption data, shows that users often prefer smaller, more energy-efficient language models over larger, more complex ones.