Key Features

Streaming and infinite-length generation
Real-time performance with low latency
Behavioral vividness and generalization across character styles
Highly responsive interaction capabilities
Robust to various character and motion styles
Outperforms state-of-the-art methods in human preference evaluation
Enables infinite durations for seamless interaction
Suitable for livestreaming and video conferencing

The model delivers behavioral vividness and perceptual realism, capturing subtle human nuances so that transitions between complex interactive states look natural. It maintains high-fidelity synthesis across diverse character styles from a single reference image. FlowAct-R1 consists of training and inference stages. These include converting the base full-attention DiT (diffusion transformer) into a streaming autoregressive (AR) model via autoregressive adaptation, and joint audio-motion finetuning for better lip-sync and body motion.
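
To make the idea of streaming autoregressive generation concrete, here is a minimal Python sketch. It is not FlowAct-R1's implementation: the names stream_chunks and toy_denoise, the chunk length, and the context size are all hypothetical, and the "model" is a toy function standing in for a real network. It only shows the general pattern, in which each new chunk is conditioned on a bounded window of earlier chunks plus the current audio input, so memory stays constant however long generation runs.

```python
from collections import deque
from itertools import islice, repeat
from typing import Callable, Deque, Iterable, Iterator, List

Chunk = List[float]  # stand-in for one chunk of video latents


def toy_denoise(context: List[Chunk], audio: float, chunk_len: int) -> Chunk:
    """Hypothetical placeholder for the generative model (assumption, not the real network).

    Continues from the last value of the most recent context chunk,
    stepping by the current audio feature.
    """
    last = context[-1][-1] if context else 0.0
    return [last + audio * (i + 1) for i in range(chunk_len)]


def stream_chunks(
    audio_stream: Iterable[float],
    denoise: Callable[[List[Chunk], float, int], Chunk] = toy_denoise,
    chunk_len: int = 4,
    max_context: int = 3,
) -> Iterator[Chunk]:
    """Yield chunks one at a time, conditioning each on a bounded history."""
    # deque(maxlen=...) silently drops the oldest chunk once the window is full,
    # which is what keeps memory constant for arbitrarily long streams.
    context: Deque[Chunk] = deque(maxlen=max_context)
    for audio in audio_stream:
        chunk = denoise(list(context), audio, chunk_len)
        context.append(chunk)
        yield chunk


# Usage: the audio stream is endless, so we take only the first five chunks.
chunks = list(islice(stream_chunks(repeat(1.0)), 5))
```

Because stream_chunks is a generator, a caller can display each chunk as soon as it is produced instead of waiting for a full clip, which is the property that matters for livestreaming-style use.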


FlowAct-R1 responds quickly during interaction, which makes it a candidate for real-time, low-latency communication scenarios. It is robust to various character and motion styles, and it outperforms state-of-the-art methods in human preference evaluation. Because generation streams in real time and can run for unbounded durations, interaction is not interrupted by clip boundaries, making the framework suitable for applications such as livestreaming and video conferencing.
