CoSER: Coordinating LLM-Based Persona Simulation of Established Roles
Xintao Wang, Heng Wang, Yifei Zhang, Xinfeng Yuan, Rui Xu, Jen-tse Huang, Siyu Yuan, Haoran Guo, Jiangjie Chen, Wei Wang, Yanghua Xiao, Shuchang Zhou
2025-02-14
Summary
This paper introduces CoSER, a new system that helps AI language models more faithfully imitate characters from books. It includes a large collection of character data, new AI models, and a method for testing how well these models can act like book characters.
What's the problem?
Current AI models struggle to accurately portray established characters from books because they don't have enough real examples of how these characters talk and think. It's also hard to tell how well the AI is doing at imitating these characters.
What's the solution?
The researchers created CoSER, which does three main things. First, they built a huge dataset covering 17,966 characters from 771 famous books, including not just what the characters say, but also their thoughts and experiences. Second, they trained new AI models (CoSER 8B and CoSER 70B) on this data. Finally, they devised a new way to test these models, called given-circumstance acting: the model acts out scenes from books, playing each of the scene's characters in turn.
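The testing idea can be sketched as a simple loop: one model is prompted to play each character in turn, and all turns share a single dialogue history. This is a minimal illustrative sketch, not the paper's actual implementation; the prompt template, the scene format, and the `generate` stub are all assumptions made up for this example.

```python
def build_prompt(scene, character, history):
    """Assemble a role-play prompt from the scene setup, the character's
    profile, and the dialogue so far (hypothetical format)."""
    lines = [
        f"Scene: {scene['setup']}",
        f"You are {character['name']}. Profile: {character['profile']}",
    ]
    for speaker, utterance in history:
        lines.append(f"{speaker}: {utterance}")
    lines.append(f"{character['name']}:")
    return "\n".join(lines)


def generate(prompt):
    # Stub standing in for a real LLM call; swap in an actual model here.
    return "(in-character reply)"


def given_circumstance_acting(scene, num_rounds=2):
    """Have one model sequentially portray every character in the scene,
    with all turns appended to a shared dialogue history."""
    history = []
    for _ in range(num_rounds):
        for character in scene["characters"]:
            prompt = build_prompt(scene, character, history)
            reply = generate(prompt)
            history.append((character["name"], reply))
    return history


scene = {
    "setup": "A tense dinner at the manor.",
    "characters": [
        {"name": "Elizabeth", "profile": "witty, observant"},
        {"name": "Darcy", "profile": "reserved, proud"},
    ],
}
dialogue = given_circumstance_acting(scene)
```

In the real protocol, the generated turns would then be scored against the book's authentic dialogue; here the loop only shows the turn-taking structure.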
Why does it matter?
This matters because it could lead to more realistic and engaging AI characters in games, virtual assistants, and educational tools. CoSER performs as well as or better than leading AI models such as GPT-4o, a significant step toward AI that can understand and imitate human-like personalities. This could make interactions with AI feel more natural and personalized in the future.
Abstract
Role-playing language agents (RPLAs) have emerged as promising applications of large language models (LLMs). However, simulating established characters presents a challenging task for RPLAs, due to the lack of authentic character datasets and nuanced evaluation methods using such data. In this paper, we present CoSER, a collection of a high-quality dataset, open models, and an evaluation protocol towards effective RPLAs of established characters. The CoSER dataset covers 17,966 characters from 771 renowned books. It provides authentic dialogues with real-world intricacies, as well as diverse data types such as conversation setups, character experiences and internal thoughts. Drawing from acting methodology, we introduce given-circumstance acting for training and evaluating role-playing LLMs, where LLMs sequentially portray multiple characters in book scenes. Using our dataset, we develop CoSER 8B and CoSER 70B, i.e., advanced open role-playing LLMs built on LLaMA-3.1 models. Extensive experiments demonstrate the value of the CoSER dataset for RPLA training, evaluation and retrieval. Moreover, CoSER 70B exhibits state-of-the-art performance surpassing or matching GPT-4o on our evaluation and three existing benchmarks, i.e., achieving 75.80% and 93.47% accuracy on the InCharacter and LifeChoice benchmarks respectively.