OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas
Xiaoyang Wang, Hongming Zhang, Tao Ge, Wenhao Yu, Dian Yu, Dong Yu
2025-01-28
Summary
This paper introduces OpenCharacter, a large-scale data synthesis approach for training large language models (LLMs) that can role-play arbitrary, user-defined characters, a capability known as character generalization. The authors synthesize large-scale character profiles from Persona Hub personas, use them to create character-aligned instruction-tuning dialogues, and fine-tune LLaMA-3 8B on the result, reaching role-playing performance comparable to GPT-4o models.
What's the problem?
Building a role-playing dialogue agent typically means collecting and curating data for each individual character, which is costly and hard to scale. A more versatile and cost-efficient alternative is character generalization, where a single model can take on any character from its profile, but equipping LLMs with this capability requires suitable character-aligned training data.
What's the solution?
The researchers first synthesize large-scale character profiles from personas in Persona Hub. They then create character-aligned instructional responses using two strategies: response rewriting, which rewrites existing instruction responses in a character's voice, and response generation, which generates in-character responses directly. To validate the synthetic data, they perform supervised fine-tuning (SFT) on LLaMA-3 8B. Their best-performing model improves on the original LLaMA-3 8B Instruct model and achieves performance comparable to GPT-4o models on role-playing dialogue.
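The two synthesis strategies can be sketched as prompt-construction steps. This is a minimal illustration only: the prompt wording, the persona fields, and the function names are assumptions, not the paper's actual templates or code.

```python
# Illustrative sketch of the two data-synthesis strategies described above.
# Prompt wording and the profile schema are assumptions, not the paper's exact templates.

def build_character_profile(name: str, persona: str) -> dict:
    """Wrap a short Persona Hub-style persona into a simple character profile."""
    return {"name": name, "persona": persona}

def response_rewriting_prompt(profile: dict, question: str, original_answer: str) -> str:
    """Response rewriting: ask an LLM to recast an existing instruction
    response so it stays correct but is spoken in the character's voice."""
    return (
        f"You are {profile['name']}: {profile['persona']}\n"
        "Rewrite the answer below so it remains factually correct but is "
        "expressed in your character's voice.\n"
        f"Question: {question}\n"
        f"Original answer: {original_answer}\n"
        "Rewritten answer:"
    )

def response_generation_prompt(profile: dict, question: str) -> str:
    """Response generation: ask an LLM to answer the instruction directly,
    in character, with no reference answer."""
    return (
        f"You are {profile['name']}: {profile['persona']}\n"
        "Answer the question below in your character's voice.\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Each synthesized triple of character profile, instruction, and character-aligned response then becomes one SFT training example.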
Why it matters?
This research matters because it shows that an open 8B-parameter model, fine-tuned on purely synthetic data, can approach GPT-4o-level role-playing performance, making customizable dialogue agents far cheaper to develop and deploy. Because the model generalizes to new characters from a profile alone, a single model can serve many characters without per-character training. The released synthetic characters and instruction-tuning dialogues also give the community a public resource for further research on role-playing LLMs.
Abstract
Customizable role-playing in large language models (LLMs), also known as character generalization, is gaining increasing attention for its versatility and cost-efficiency in developing and deploying role-playing dialogue agents. This study explores a large-scale data synthesis approach to equip LLMs with character generalization capabilities. We begin by synthesizing large-scale character profiles using personas from Persona Hub and then explore two strategies: response rewriting and response generation, to create character-aligned instructional responses. To validate the effectiveness of our synthetic instruction tuning data for character generalization, we perform supervised fine-tuning (SFT) using the LLaMA-3 8B model. Our best-performing model strengthens the original LLaMA-3 8B Instruct model and achieves performance comparable to GPT-4o models on role-playing dialogue. We release our synthetic characters and instruction-tuning dialogues to support public research.