This is the inference code of GPT-SoVITS that can be developer-friendly.
Check out the example notebook for a quick start. Or open it in Colab
As we all know, the dependencies of an AI project are always a mess. Here is how I prepare the environment for this project:
Conda (Linux)
conda install python=3.10
conda install pytorch=2.1 torchvision torchaudio pytorch-lightning pytorch-cuda=12.1 -c pytorch -c nvidia
conda install ffmpeg=6.1.1 -c conda-forge
MacOS
brew install ffmpeg
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip3 install pytorch-lightning
pip3 install GPT-SoVITS-Infer
You can also try to prepare the environment with cpu only options, which should work, but I have not tested it yet.
After the environment is ready, you can install the package by pip:
pip install GPT-SoVITS
I do not add the packages related to torch to the dependencies of GPT-SoVITS-Infer. Check if the environment is ready if things go wrong.
GPTSoVITSInference.load_sovits
andGPTSoVITSInference.load_gpt
: You can load your own fine-tuned model by the methods.GPTSoVITSInference.set_prompt_audio
: Set the prompt audio for the inference.GPTSoVITSInference.get_tts_wav_stream
: Return a generator that yields the audio pieces of the generated audio. It will create a background thread to generate the audio, so you can get the audio pieces while the audio is still being generated.