PersonaFeedback: A Large-scale Human-annotated Benchmark For Personalization
Meiling Tao, Chenghao Zhu, Dongyi Ding, Tiannan Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou
2025-06-17
Summary
This paper introduces PersonaFeedback, a new large-scale, human-annotated benchmark designed to test how well Large Language Models (LLMs) generate personalized responses when given explicit details about a user's background, personality, or preferences, known as personas. It measures how well these models tailor their answers to different users based on that explicit persona information.
What's the problem?
The problem is that current AI language models often fail to reliably adapt their responses to different user personalities or preferences, even when they are given clear information about a person. As a result, the AI may produce generic or mismatched answers, limiting how useful and engaging it can be in personalized interactions.
What's the solution?
The solution was to build PersonaFeedback, which collects human-annotated examples in which AI responses are judged on how well they match a given persona. With this benchmark, researchers can systematically evaluate existing models, identify their strengths and weaknesses in personalization, and use the results to build AI systems that better tailor their replies to individual users.
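The evaluation setup described above can be sketched in code. This is a minimal, hypothetical illustration, not the paper's actual data schema or judging protocol: it assumes each benchmark entry pairs a persona and query with two candidate responses and a human preference label, and it stands in a trivial keyword-overlap heuristic for the judge (a real evaluation would use an LLM or human annotator here).

```python
from dataclasses import dataclass

@dataclass
class PersonaExample:
    # Hypothetical schema: a persona and query paired with two candidate
    # responses and the human annotators' preferred choice ("a" or "b").
    persona: str
    query: str
    response_a: str
    response_b: str
    human_choice: str

def judge(persona: str, response_a: str, response_b: str) -> str:
    """Placeholder judge: prefers the response mentioning more persona
    keywords. A real benchmark run would call an LLM judge instead."""
    keywords = [w.lower() for w in persona.split() if len(w) > 4]
    score_a = sum(k in response_a.lower() for k in keywords)
    score_b = sum(k in response_b.lower() for k in keywords)
    return "a" if score_a >= score_b else "b"

def benchmark_accuracy(examples: list[PersonaExample]) -> float:
    # Fraction of examples where the judge agrees with human annotators.
    correct = sum(
        judge(e.persona, e.response_a, e.response_b) == e.human_choice
        for e in examples
    )
    return correct / len(examples)
```

Agreement with human preference labels is then a single number per model, which is what makes benchmark-style comparisons across LLMs possible.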
Why it matters?
This matters because personalization helps AI conversations feel more natural and relevant to each person, improving the user experience in applications like virtual assistants, tutoring, and customer service. A strong benchmark like PersonaFeedback enables systematic measurement and improvement, pushing AI to understand and respond better to diverse user needs and personalities.
Abstract
A new benchmark, PersonaFeedback, evaluates Large Language Models' ability to generate personalized responses given explicit user personas, revealing limitations in current systems.