Key Features

2B-parameter fully continuous autoregressive text-to-speech system.
Uses a semantic encoder, LLM, and autoregressive flow-matching acoustic head.
Runs over a 48 kHz AudioVAE with no discrete tokens in the pipeline.
Reports strong WER and speaker-similarity metrics on Chinese, English, and multilingual benchmarks.
Supports monolingual and cross-lingual voice cloning examples.
Includes context-aware expressive voice cloning demonstrations.
Provides GitHub, Hugging Face collection, and Hugging Face Space links.
Released under the Apache-2.0 license, as linked on the project page.

The project page emphasizes benchmark results for Chinese, English, and a hard-case Chinese evaluation, along with multilingual speaker similarity, voice cloning, and emotional expressiveness. It also includes audio samples for monolingual, cross-lingual, and context-aware expressive voice cloning.
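For readers unfamiliar with the two metrics named above, here is a minimal sketch of how they are commonly computed. It is not the dots.tts evaluation code: the function names are hypothetical, WER is taken as word-level edit distance divided by the reference length, and speaker similarity is assumed to be cosine similarity between two speaker-embedding vectors produced by some separate model. Splitting on whitespace is also an assumption; Chinese text would usually be scored per character instead.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)
```

In a typical setup the synthesized audio is transcribed by a speech recognizer and the transcript is scored against the input text with `word_error_rate`, while `cosine_similarity` compares an embedding of the reference speaker with an embedding of the generated speech. Lower WER and higher similarity are better.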


dots.tts is useful for speech AI researchers and developers who want an open, high-quality TTS system with voice cloning and multilingual capabilities. The public GitHub repository, Hugging Face collection, and demo Space make it practical to inspect the model assets and try the examples.
