Key Features

Full-duplex model for natural conversations
Customizable voice and role through text prompts
Handles interruptions, backchannels, and conversational rhythm
Low-latency interaction with dual-stream configuration
Built on Moshi architecture with 7 billion parameters
Trained on a blend of real and synthetic conversations
Generalizes to text prompts well outside its training distribution
Outperforms other conversational AI agents on conversational dynamics, latency, and task adherence

PersonaPlex uses two inputs to define conversational behavior: a voice prompt that captures vocal characteristics, speaking style, and prosody, and a text prompt that describes the role, background information, and conversation context. The two inputs are processed jointly to create a coherent persona. The model is built on the Moshi architecture and has 7 billion parameters. Its dual-stream configuration lets it listen and speak at the same time, which is what makes interruptions, backchannels, and natural turn-taking possible.
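To make the two-input, dual-stream idea concrete, here is a minimal Python sketch. It is a toy illustration only, not the PersonaPlex API: the `Persona` class, the `full_duplex_loop` function, the string "frames", and the reply policy are all hypothetical names and simplifications introduced for this example. The point it shows is that every step consumes one incoming user frame and emits one outgoing agent frame, so listening and speaking are never separate phases.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Persona:
    """Hypothetical container for the two conditioning inputs."""
    voice_prompt: str  # assumed: a reference to a short audio clip
    text_prompt: str   # role, background information, conversation context


def full_duplex_loop(
    persona: Persona, user_frames: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Toy dual-stream loop: each step reads a user frame AND emits an agent frame.

    Frames are plain strings here; an empty string stands for silence.
    The policy below is a placeholder, not the model's actual behavior.
    """
    for user_frame in user_frames:
        if user_frame:
            # The user is talking (possibly interrupting): the agent stays silent.
            agent_frame = ""
        else:
            # The user is silent: the agent speaks in the configured voice.
            agent_frame = f"<speech voice={persona.voice_prompt}>"
        yield user_frame, agent_frame


persona = Persona(
    voice_prompt="reference.wav",
    text_prompt="You are a customer service agent for a bank.",
)
for user_frame, agent_frame in full_duplex_loop(persona, ["hi", "", "wait"]):
    print(repr(user_frame), "->", repr(agent_frame))
```

In this sketch the third frame ("wait") cuts the agent off immediately, because the incoming stream is inspected on every step rather than only at the end of the agent's turn.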


PersonaPlex was trained on a blend of real and synthetic conversations, including 7,303 real conversations from the Fisher English corpus and 39,322 synthetic assistant-role conversations. The model generalizes well to text prompts far outside its training distribution and keeps its persona consistent with the text prompt throughout extended interactions. PersonaPlex outperforms other conversational AI agents on conversational dynamics, response and interruption latency, and task adherence in both question-answering assistant and customer service roles.
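For a rough sense of the data mix, the short Python calculation below works out the share of each listed subset. It assumes the two counts given above make up the whole blend, which the description does not confirm (it says "including"), so treat the percentages as relative to these two subsets only.

```python
# Counts as stated in the description above.
real = 7_303        # real conversations, Fisher English corpus
synthetic = 39_322  # synthetic assistant-role conversations

# Assumption: these two subsets are the entire blend.
total = real + synthetic
real_share = real / total
synthetic_share = synthetic / total

print(f"total conversations: {total}")      # 46625
print(f"real: {real_share:.1%}")            # 15.7%
print(f"synthetic: {synthetic_share:.1%}")  # 84.3%
```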
