Step Audio EditX

Free Audio Speech Editing

LikeWebsite Promote

Key Features

3 billion parameter LLM optimized for expressive and iterative audio editing

Dual codebook tokenizer capturing linguistic and prosody/emotion information

Trained on 200,000 hours of speech data for naturalness and timbre accuracy

Supports zero-shot TTS and flexible editing through natural language instructions

Post-training includes supervised fine-tuning and reinforcement learning optimization

Open-source with code and checkpoints available for developer customization

Enables editing of voice recordings at the token level without re-recording

Can improve speech from closed-source TTS systems and integrates into workflows

The model harnesses a dual training approach, starting with supervised fine-tuning to align the system for zero-shot TTS and editing tasks in a chat-like prompt format, followed by reinforcement learning via Proximal Policy Optimization to enhance control fidelity. It was trained on approximately 200,000 hours of high-quality speech data which improves its naturalness, pronunciation, and timbre similarity. Step Audio EditX stands out by handling discrete audio tokens and performing edits in a way that feels as direct and intuitive as rewriting text, making it a breakthrough in controllable speech synthesis and post-processing of audio from closed-source TTS systems.

The open-source release of Step Audio EditX offers significant benefits for content creators, marketers, and developers who need high-flexibility audio editing tools. For podcasters, advertisers, or video producers, it enables post-production adjustments like making a sentence calmer, adding pauses, or altering speaker emotion after recording. For engineers and founders, it can be integrated into content creation pipelines, dubbing workflows, or conversational AI solutions, supporting local fine-tuning and rapid deployment without licensing constraints. The model's innovative design and accessible architecture democratize expressive audio editing and reduce barriers to experimentation in audio AI research.

Get more likes & reach the top of search results by adding this button on your site!

Step Audio EditX

Key Features

Zero to AI Engineer

Subscribe to the AI Search Newsletter