Stable Audio Open


Key Features

  •  Produces stereo audio at 44.1 kHz, up to 47 seconds long.
  •  Accessible on Hugging Face for community use.
  •  Combines an autoencoder, a T5-based text embedding, and a transformer-based diffusion model.
  •  Trained on nearly 500,000 recordings from Freesound and the Free Music Archive.
  •  Suitable for sound design, ambient sounds, sample creation, audio branding, and academic projects.
  •  Runs on consumer-grade GPUs, and can be trained locally on GPUs such as the A6000.
  •  Can be fine-tuned to meet specific needs in various industries and creative projects.
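
To make the output format above concrete, here is a minimal sketch of the arithmetic behind a maximum-length clip: stereo, 44.1 kHz, 47 seconds. The function name `clip_footprint` and the 16-bit PCM storage assumption are illustrative only; they are not part of the model or its tooling.

```python
# Output format figures taken from the feature list above.
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz
CHANNELS = 2              # stereo
MAX_SECONDS = 47          # maximum clip length


def clip_footprint(seconds=MAX_SECONDS, bytes_per_sample=2):
    """Return (frames, samples, raw_bytes) for a clip of the given length.

    bytes_per_sample=2 assumes 16-bit PCM storage; this is an assumption
    made for illustration, not a property of the model.
    """
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be in (0, {MAX_SECONDS}]")
    frames = round(seconds * SAMPLE_RATE_HZ)   # samples per channel
    samples = frames * CHANNELS                # total samples across channels
    return frames, samples, samples * bytes_per_sample


frames, samples, raw_bytes = clip_footprint()
print(frames, samples, raw_bytes)
```

Under these assumptions a full 47-second clip holds 2,072,700 frames per channel and takes roughly 8.3 MB as uncompressed 16-bit PCM.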

