Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles
Kuang Wang, Xianfei Li, Shenghao Yang, Li Zhou, Feng Jiang, Haizhou Li
2025-03-10
Summary
This paper introduces User Simulator with implicit Profiles (USP), a framework for building user simulators that act more like real people by inferring each user's implicit traits, such as personality, speaking style, and goals, from their conversations.
What's the problem?
Current user simulators, the AI systems that play the human side of a conversation to test and improve dialogue systems, don't capture how real people talk and behave. They either rely on the text of utterances alone and miss traits like personality, speaking style, and goals, or they depend on predefined profiles of famous individuals or archetypes that don't generalize to other situations.
What's the solution?
The researchers created USP, which infers a user's implicit profile from human-machine conversations and then uses that profile to make the simulator talk more realistically. They first built an LLM-driven extractor with a comprehensive profile schema, then trained the simulator with conditional supervised fine-tuning and reinforcement learning with cycle consistency, optimizing it at both the utterance and conversation levels. They also added a diverse profile sampler so the simulator can represent a wide variety of user types and match real-world diversity.
Why does it matter?
This matters because more realistic user simulators can help improve the chatbots and virtual assistants we use every day. With simulators that act more like real people, developers can train and evaluate dialogue systems automatically and build systems that understand and respond to us more naturally. This could lead to more helpful and user-friendly AI assistants in the future.
Abstract
User simulators are crucial for replicating human interactions with dialogue systems, supporting both collaborative training and automatic evaluation, especially for large language models (LLMs). However, existing simulators often rely solely on text utterances, missing implicit user traits such as personality, speaking style, and goals. In contrast, persona-based methods lack generalizability, as they depend on predefined profiles of famous individuals or archetypes. To address these challenges, we propose User Simulator with implicit Profiles (USP), a framework that infers implicit user profiles from human-machine conversations and uses them to generate more personalized and realistic dialogues. We first develop an LLM-driven extractor with a comprehensive profile schema. Then, we refine the simulation through conditional supervised fine-tuning and reinforcement learning with cycle consistency, optimizing it at both the utterance and conversation levels. Finally, we adopt a diverse profile sampler to capture the distribution of real-world user profiles. Experimental results demonstrate that USP outperforms strong baselines in terms of authenticity and diversity while achieving comparable performance in consistency. Furthermore, dynamic multi-turn evaluations based on USP strongly align with mainstream benchmarks, demonstrating its effectiveness in real-world applications.
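To make the cycle-consistency idea concrete, here is a minimal Python sketch. The abstract does not specify the reward, so everything in it is an assumption: the premise that the reward compares the target profile with a profile re-extracted from the simulated user's utterances, the token-overlap similarity, and the names `jaccard`, `cycle_consistency_reward`, and `toy_extract`. The toy extractor is a keyword stand-in for the LLM-driven extractor, not the authors' implementation.

```python
from typing import Callable, Dict, List

# A profile is modeled here as a flat field -> text mapping (an assumption;
# the paper's schema is described only as "comprehensive").
Profile = Dict[str, str]


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def cycle_consistency_reward(
    target: Profile,
    user_utterances: List[str],
    extract: Callable[[List[str]], Profile],
) -> float:
    """Re-extract a profile from simulated utterances and score how well
    it matches the profile the simulator was conditioned on."""
    if not target:
        return 0.0
    recovered = extract(user_utterances)
    scores = [jaccard(value, recovered.get(field, "")) for field, value in target.items()]
    return sum(scores) / len(scores)


def toy_extract(utterances: List[str]) -> Profile:
    """Hypothetical keyword-based extractor, used only for illustration."""
    text = " ".join(utterances).lower()
    return {
        "goal": "book a flight" if "flight" in text else "",
        "style": "casual" if "hey" in text else "formal",
    }


target = {"goal": "book a flight", "style": "casual"}
good = cycle_consistency_reward(
    target, ["hey, can you help me find a flight to Oslo?"], toy_extract
)
bad = cycle_consistency_reward(
    target, ["Good afternoon. Please recommend a novel."], toy_extract
)
print(good, bad)
```

In this sketch, a simulated dialogue whose re-extracted profile matches the target earns a higher reward than one that drifts away from it. A reinforcement learning step could use such a score as a conversation-level signal, though the optimization procedure itself is not shown here.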