CLEAR: Error Analysis via LLM-as-a-Judge Made Easy
Asaf Yehudai, Lilach Eden, Yotam Perlitz, Roy Bar-Haim, Michal Shmueli-Scheuer
2025-07-28
Summary
This paper introduces CLEAR, a tool that uses large language models as judges to analyze errors made by AI systems and provides detailed feedback with helpful visualizations.
What's the problem?
Understanding why AI models make mistakes is difficult and time-consuming, especially without clear explanations or tools to visualize what went wrong.
What's the solution?
The researchers developed CLEAR, which uses large language models as judges to automatically detect errors, explain them in detail, and generate visual reports that help users quickly find and understand performance problems.
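The general LLM-as-a-judge workflow described above can be sketched as follows. This is a minimal illustration, not CLEAR's actual API: the `judge` function here is a hypothetical stub standing in for a real LLM call, and the issue labels are invented for the example.

```python
from collections import Counter

def judge(question: str, answer: str) -> dict:
    """Hypothetical judge. In a real system an LLM would be prompted to
    score the answer and name the error type; here simple rules stand in
    for that call so the sketch runs on its own."""
    if answer.strip() == "":
        return {"score": 0, "issue": "empty answer"}
    if answer.strip().endswith("?"):
        return {"score": 0, "issue": "answered with a question"}
    return {"score": 1, "issue": None}

def analyze(examples: list[tuple[str, str]]) -> Counter:
    """Aggregate per-example judgments into counts of recurring issues --
    the kind of summary an error-analysis report would visualize."""
    issues = Counter()
    for question, answer in examples:
        verdict = judge(question, answer)
        if verdict["score"] == 0:
            issues[verdict["issue"]] += 1
    return issues

report = analyze([
    ("What is 2+2?", "4"),
    ("Capital of France?", ""),
    ("Largest planet?", "Is it Jupiter?"),
])
```

Running `analyze` over a set of question-answer pairs yields a tally of recurring error categories, which is the kind of aggregate view a developer would then inspect.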
Why does it matter?
CLEAR helps developers fix AI models more quickly and accurately by giving them clear insights into errors, leading to better and more reliable AI systems.
Abstract
CLEAR is an interactive, open-source package for LLM-based error analysis that provides detailed feedback and visualizations to understand specific performance issues.