Dramabox

Free Audio Open-Source

Key Features

Generates expressive TTS with prompt-controlled emotion and delivery.

Supports optional 10-second voice references for voice cloning.

Controls laughs, sighs, breaths, pauses, transitions, and performance style.

Built as an IC-LoRA fine-tune of the LTX-2.3 3.3B audio-only model.

Uses a diffusion transformer with flow matching for audio generation.

Conditions generation on Gemma 3 12B text embeddings.

Provides Hugging Face model, demo Space, and GitHub code resources.

Targets audio drama, games, animation, character speech, and expressive assistants.

Dramabox is an IC-LoRA fine-tune of the LTX-2.3 3.3B audio-only model. It uses a diffusion transformer with flow matching and conditions generation on Gemma 3 12B text embeddings. With this setup, the model goes beyond conventional neutral TTS and produces expressive performance cues and dramatic shifts in delivery. It is built on LTX-2, is released under the LTX-2 Community License, and links to the model, a demo Space, and the code.
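To give a feel for what flow-matching generation means in practice, here is a minimal toy sketch of the sampling loop: start from Gaussian noise and integrate a velocity field from t = 0 to t = 1 with Euler steps. This is not Dramabox's sampler. The closed-form `velocity` function is a hypothetical stand-in for the trained diffusion transformer, which in a real system would predict the velocity from the noisy audio latent, the timestep, and the text embeddings. The names `velocity`, `sample`, and `target` are assumptions made for illustration.

```python
import random


def velocity(x, t, target):
    """Toy velocity field for a straight noise-to-target path.

    Hypothetical stand-in for a learned network: it points from the
    current state x toward a single known target.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample(target, steps=8, seed=0):
    """Integrate the velocity field from t=0 (noise) to t=1 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # Gaussian noise at t=0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


target = [0.5, -0.25, 1.0]  # pretend "clean latent" for the toy example
result = sample(target)
```

In this toy setting the final Euler step lands on the target, so `result` matches `target` up to floating-point rounding. A learned model has no single known target, so its output depends on the conditioning and on the number of steps.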

Dramabox is useful for audio drama, games, animation, expressive voice assistants, character prototyping, and synthetic dialogue datasets. Its main value is controllability: a user can write a prompt that specifies not only what is said but how it is performed. Because the Hugging Face page exposes model resources and code links, this listing marks it as a free open-source audio model.
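The split between what is said and how it is performed can be sketched as a small prompt builder. The layout below is invented for illustration and is not Dramabox's documented prompt format. The class `LinePrompt`, its fields, and the rendered string are all hypothetical, so check the model card for the wording the model actually expects.

```python
from dataclasses import dataclass, field


@dataclass
class LinePrompt:
    """Hypothetical container separating spoken text from delivery directions."""

    text: str                      # what is said
    delivery: str = "neutral"      # how it is performed
    cues: list = field(default_factory=list)  # e.g. laughs, sighs, pauses

    def render(self):
        # Assumed layout only; the real model may expect a different format.
        cue_part = ", ".join(self.cues) if self.cues else "none"
        return f"Delivery: {self.delivery}. Cues: {cue_part}. Line: {self.text}"


prompt = LinePrompt(
    text="I can't believe you came back.",
    delivery="whispered, trembling",
    cues=["sigh", "short pause"],
)
rendered = prompt.render()
```

Keeping the line and its directions in separate fields makes it easy to reuse the same text with different performances, for example when prototyping a character or building a synthetic dialogue dataset.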
