SoftVC VITS Singing Voice Conversion
FreeSpeechAudio
The purpose of this project was to enable developers to have their beloved anime characters sing.

The singing voice conversion model uses the SoftVC content encoder to extract speech features from the source audio. These feature vectors are fed directly into VITS, without conversion to a text-based intermediate representation. As a result, the pitch and intonation of the original audio are preserved. Meanwhile, the vocoder was replaced with NSF HiFiGAN to solve the problem of sound interruption.
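The data flow described above can be illustrated with a toy sketch. This is not the project's code and does not call SoftVC, VITS, or NSF HiFiGAN: every function name, the sample rate, and the frame size are assumptions made for illustration. It only shows the idea that frame-level features and a pitch (F0) track, rather than text, are carried from the source audio to a source-filter style synthesizer that rebuilds the waveform from a sine excitation, so the pitch contour of the input survives.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate for this toy example
FRAME = 320          # assumed frame size (20 ms at 16 kHz)


def make_tone(freq_hz, num_samples):
    """Generate a pure sine tone that stands in for the source singing audio."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(num_samples)]


def split_frames(audio):
    """Cut the audio into non-overlapping frames of FRAME samples."""
    return [audio[i:i + FRAME] for i in range(0, len(audio) - FRAME + 1, FRAME)]


def toy_content_features(frames):
    """Hypothetical stand-in for a content encoder: one vector per frame.

    A real content encoder produces learned speech features; here each
    frame is reduced to its mean energy. No text is involved at any point.
    """
    return [[sum(x * x for x in frame) / len(frame)] for frame in frames]


def estimate_f0(frames, fmin=80, fmax=800):
    """Estimate one pitch value per frame with a plain autocorrelation search."""
    f0 = []
    for frame in frames:
        best_lag, best_score = 0, 0.0
        for lag in range(SAMPLE_RATE // fmax, SAMPLE_RATE // fmin + 1):
            score = sum(frame[i] * frame[i + lag]
                        for i in range(len(frame) - lag))
            if score > best_score:
                best_lag, best_score = lag, score
        f0.append(SAMPLE_RATE / best_lag if best_lag else 0.0)
    return f0


def toy_source_filter_vocoder(features, f0):
    """Hypothetical stand-in for a source-filter vocoder.

    A sine excitation is driven by the F0 track with a phase that runs
    continuously across frame boundaries, then scaled by the frame energy.
    """
    audio, phase = [], 0.0
    for (energy,), hz in zip(features, f0):
        amplitude = math.sqrt(2 * energy)
        for _ in range(FRAME):
            audio.append(amplitude * math.sin(phase))
            phase += 2 * math.pi * hz / SAMPLE_RATE
    return audio


def convert(audio):
    """Run the toy pipeline: frames -> features + F0 -> resynthesized audio."""
    frames = split_frames(audio)
    features = toy_content_features(frames)
    f0 = estimate_f0(frames)
    return toy_source_filter_vocoder(features, f0), f0


source = make_tone(200.0, 3200)
converted, source_f0 = convert(source)
converted_f0 = estimate_f0(split_frames(converted))
print("source F0 per frame:   ", [round(hz, 1) for hz in source_f0])
print("converted F0 per frame:", [round(hz, 1) for hz in converted_f0])
```

In the real model the per-frame features are learned embeddings and the synthesizer is a neural network conditioned on a target speaker; the sketch keeps only the shape of the pipeline, where pitch travels alongside the content features instead of being discarded by a text bottleneck.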