The model is an IC-LoRA fine-tune of the LTX-2.3 3.3B audio-only model. It uses a diffusion transformer trained with flow matching and is conditioned on Gemma 3 12B text embeddings. This setup lets the model go beyond conventional neutral-sounding TTS by producing expressive performance cues and dramatic shifts in delivery. It is built on LTX-2 under the LTX-2 Community License, and its page links to the model, a demo space, and the code.
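
As a rough illustration of the flow-matching idea mentioned above, the sketch below integrates a velocity field from Gaussian noise toward a target with plain Euler steps. It is a toy under stated assumptions, not Dramabox's actual sampler: the `velocity` function is a hypothetical closed-form stand-in for the learned diffusion transformer, the "audio latent" is just a short list of floats, and text conditioning is omitted entirely.

```python
import random


def velocity(x, t, target):
    """Hypothetical stand-in for the learned velocity network.

    For a straight-line path x_t = (1 - t) * noise + t * target with a
    single target point, the ideal velocity at (x, t) is (target - x) / (1 - t).
    A real model would predict this with a conditioned transformer.
    """
    return [(x1 - xi) / (1.0 - t) for xi, x1 in zip(x, target)]


def sample(target, steps=8, seed=0):
    """Euler-integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (data)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays below 1, so the division in velocity() is safe
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


toy_target = [0.5, -1.0, 2.0]  # made-up "latent", for illustration only
result = sample(toy_target)
```

Because the toy path is a straight line, a handful of Euler steps is enough to land on the target; a trained model follows a learned field instead and typically needs a tuned step count.
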
Dramabox is useful for audio drama, games, animation, expressive voice assistants, character prototyping, and synthetic dialogue datasets. Its main value is controllability: a prompt can specify not only what is said but also how it is performed. Because the Hugging Face page provides model resources and code links, this listing marks it as a free open-source audio model.
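
To make the "what is said" versus "how it is performed" split concrete, here is a small sketch that keeps the two parts separate and joins them into one prompt string. The `LinePrompt` class and the bracketed layout are hypothetical and chosen only for illustration; this listing does not specify the prompt syntax the model actually expects, so check the model page before relying on any particular format.

```python
from dataclasses import dataclass


@dataclass
class LinePrompt:
    """One spoken line plus its performance direction (hypothetical helper)."""

    text: str      # what is said
    delivery: str  # how it is performed

    def render(self) -> str:
        # Assumed layout: bracketed direction followed by the quoted line.
        return f'[{self.delivery}] "{self.text}"'


prompt = LinePrompt(
    text="You came back.",
    delivery="whispered, voice breaking",
).render()
```

Keeping the direction in its own field makes it easy to vary the delivery while holding the line fixed, which is convenient when prototyping characters or generating dialogue datasets.
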


