PersonaPlex defines conversational behavior with two inputs: a voice prompt that captures vocal characteristics, speaking style, and prosody, and a text prompt that describes the role, background, and conversation context. The two prompts are processed jointly to produce a coherent persona. The model builds on the Moshi architecture, with 7 billion parameters and a dual-stream configuration in which listening and speaking occur concurrently, enabling natural conversational dynamics.
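The dual-stream design is easiest to picture as a loop in which every step both consumes a user audio frame and emits an agent frame. The sketch below is purely illustrative and is not the PersonaPlex API: `PersonaPrompt`, `DualStreamSession`, and the byte-string frame format are hypothetical stand-ins for the Moshi-style audio streams.

```python
from dataclasses import dataclass, field

@dataclass
class PersonaPrompt:
    voice_prompt: bytes  # reference audio capturing timbre, style, prosody
    text_prompt: str     # role, background, and conversation context

@dataclass
class DualStreamSession:
    persona: PersonaPrompt
    heard: list = field(default_factory=list)

    def step(self, user_frame: bytes) -> bytes:
        # Full-duplex tick: consume the user's audio frame and emit an
        # agent frame in the same step, so listening and speaking overlap
        # instead of alternating in rigid turns.
        self.heard.append(user_frame)
        return self._decode_frame()

    def _decode_frame(self) -> bytes:
        # Stand-in for the Moshi-style audio LM decode; emits silence here.
        return b"\x00" * len(self.heard[-1])

session = DualStreamSession(
    PersonaPrompt(
        voice_prompt=b"<reference audio>",
        text_prompt="You are a patient billing-support agent.",
    )
)
for user_frame in (b"\x01" * 160, b"\x02" * 160):
    agent_frame = session.step(user_frame)  # produced while still listening
```

Because an output frame is generated on every input frame, interruption handling falls out of the loop structure itself: the model can cut its own utterance short as soon as incoming frames indicate the user has started speaking.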
PersonaPlex was trained on a blend of real and synthetic conversations: 7,303 real conversations from the Fisher English corpus and 39,322 synthetic assistant-role conversations. The model generalizes to text prompts well outside its training distribution and maintains a persona consistent with the text prompt throughout extended interactions. It outperforms other conversational AI agents on conversational dynamics, response and interruption latency, and task adherence in both question-answering assistant and customer service roles.
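For a concrete sense of the data blend, the snippet below tallies the two sources named above; the dictionary keys and layout are hypothetical, and only the counts come from the text. The split works out to roughly 16% real and 84% synthetic conversations.

```python
# Illustrative tally of the training blend described above.
TRAINING_MIX = {
    "fisher_english_real": 7_303,         # real telephone conversations
    "synthetic_assistant_roles": 39_322,  # synthetic assistant-role dialogues
}
total = sum(TRAINING_MIX.values())
for source, count in TRAINING_MIX.items():
    print(f"{source}: {count} ({count / total:.1%})")
```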


