AI Voice Cloning

Note: I do not plan to actively work on improvements or enhancements for this project. This fork is mainly meant to keep the repo in a working state in case the original git.ecker goes down or necessary package changes need to be made.

That being said, some enhancements have been added compared to the original repo:

✔️ Possible to train in other languages

✔️ Hifigan added, allowing for faster inference at the cost of quality.

✔️ whisper-v3 added as a selectable option for whisperx

✔️ Output conversion using RVC

This is a fork of the repo originally located here: https://git.ecker.tech/mrq/ai-voice-cloning. All of the work that was put into it to incorporate training with DLAS and inference with Tortoise belongs to mrq, the author of the original ai-voice-cloning repo.

Setup

This repo works on Windows with NVIDIA GPUs and Linux running Docker with NVIDIA GPUs.

Windows Package (Recommended)

Optional, but recommended: Install 7zip on your computer: https://www.7-zip.org/
- If you run into any extraction issues, it is most likely because your 7zip is out of date or you are using a different extractor.
Head over to the releases tab and download the latest package (hosted on Hugging Face): https://github.com/JarodMica/ai-voice-cloning/releases/tag/v3.0
Extract the 7zip archive.
Open up ai-voice-cloning and then run start.bat

Alternative Manual Installation

If you are installing this manually, you will need:

Python 3.11: https://www.python.org/downloads/release/python-311/
Git: https://www.git-scm.com/downloads

Clone the repository

git clone https://github.com/JarodMica/ai-voice-cloning.git

Run the setup-cuda.bat file. It will install all of the Python packages needed.
- If you don't have Python 3.11, it won't work and you'll need to download it first.
After it finishes, run start.bat and this will start downloading most of the models you'll need.
- Some models are downloaded when you first use them. You'll incur additional downloads during generation and when training (for whisper). However, once they are finished, you won't ever have to download them again as long as you don't delete them. They are located in the models folder of the root.
(Optional) You can opt to install whisperx for training by running setup-whisperx.bat
- Check out the whisperx GitHub page for more details, but it's much faster for longer audio files. If you're processing one-by-one with an already split dataset, it doesn't improve speeds that much.
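
Because the manual installation depends on Python 3.11, it can help to confirm which interpreter is on your PATH before running setup-cuda.bat. The following is a minimal sketch and is not part of this repo; the function name is_supported_python is hypothetical.

```python
import sys


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is 3.11.

    Hypothetical helper, not part of this repo. It only mirrors the
    Python 3.11 requirement stated in the manual installation steps.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {found} matches the required 3.11")
    else:
        print(f"Python {found} found, but the setup expects 3.11")
```

Save it under any name and run it with the same python command that setup-cuda.bat would pick up.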

Docker for Linux (or WSL2)

Linux Specific Setup

Make sure the latest nvidia drivers are installed: sudo ubuntu-drivers install
Install Docker your preferred way. One way to do it is to follow the official documentation here.
- Start by uninstalling the old versions
- Follow the "apt" repository installation method
- Check that everything is working with the "hello-world" container
If you get an error message saying that the GPU cannot be used when launching the voice cloning container, you might have to install the NVIDIA Container Toolkit.
- Install with the "apt" method
- Run the docker configuration command
  
  sudo nvidia-ctk runtime configure --runtime=docker
- Restart docker

Windows Specific Setup

Make sure your Nvidia drivers are up to date: https://www.nvidia.com/download/index.aspx

Install WSL2 in PowerShell with wsl --install and restart
Open PowerShell, type ubuntu, and press Enter. This should load you into WSL2.
Remove the original nvidia cache key: sudo apt-key del 7fa2af80
Download CUDA toolkit keyring: wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
Install keyring: sudo dpkg -i cuda-keyring_1.1-1_all.deb
Update package list: sudo apt-get update
Install CUDA toolkit: sudo apt-get -y install cuda-toolkit-12-4
Install Docker Desktop using WSL2 as the backend
Restart
If you wish to monitor the terminal remotely via SSH, follow this guide.
Open PowerShell, type ubuntu, then follow below

Building and Running in Docker

Open a terminal (or Ubuntu WSL)
Clone the repository: git clone https://github.com/JarodMica/ai-voice-cloning.git && cd ai-voice-cloning
Build the image with ./setup-docker.sh
Start the container with ./start-docker.sh
Visit http://localhost:7860 or remotely with http://<ip>:7860

If the remote server cannot be reached, check out this thread
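
Before digging into network settings, it can be useful to check whether anything is listening on port 7860 at all. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical and not part of this repo.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # 7860 is the port given in the steps above; replace "localhost"
    # with the server's IP address to test remote access.
    host, port = "localhost", 7860
    state = "reachable" if is_port_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the port is reachable from the machine running Docker but not from another machine, the problem is more likely a firewall or network setting than the container itself.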

You might also need to remap your local folders to the Docker folders. To do this, open the "start-docker.sh" script and update some lines. For instance, if you want to find your generated audio files easily, create a "results" folder in the root directory, and then in "start-docker.sh" add the line:

-v "your/custom/path:/home/user/ai-voice-cloning/results"

Instructions

Check out the YouTube videos:

Watch First: https://youtu.be/WWhNqJEmF9M?si=RhUZhYersAvSZ4wf

Watch Second (RVC update): https://www.youtube.com/watch?v=7tpWH8_S8es&t=504s

Everything is pretty much the same as before if you've used this repository in the past. However, there is a new option to convert the output using RVC. Before you can use it, you will need a trained RVC .pth file, which you can get from RVC or online, and you will need to place it in models/rvc_models/. Both .index and .pth files can be placed in this folder and they'll show up in their respective dropdown menus.

To enable RVC:

Check and enable Show Experimental Settings to reveal more options
Check and enable Run the outputter audio through RVC. You will now have access to parameters you could adjust in RVC for the RVC voice model you're using.
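
If a model does not show up in the dropdowns, a quick way to see what is actually in models/rvc_models/ is to list the folder by file extension. The following is a minimal sketch, not the repo's own loading code; the function names group_rvc_files and scan_rvc_models are hypothetical.

```python
from pathlib import Path


def group_rvc_files(filenames):
    """Split file names into .pth and .index lists, ignoring everything else.

    Hypothetical helper, not part of this repo.
    """
    grouped = {".pth": [], ".index": []}
    for name in sorted(filenames):
        suffix = Path(name).suffix.lower()
        if suffix in grouped:
            grouped[suffix].append(name)
    return grouped


def scan_rvc_models(models_dir="models/rvc_models"):
    """List the RVC files in models_dir; a missing folder yields empty lists."""
    folder = Path(models_dir)
    if not folder.is_dir():
        return group_rvc_files([])
    return group_rvc_files(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    found = scan_rvc_models()
    print("voice models (.pth):", found[".pth"])
    print("index files (.index):", found[".index"])
```

Run it from the root of the repository so the relative path models/rvc_models resolves correctly.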

Updating Your Installation

Below is how to update each type of installation to the latest version.

Windows

NOTE: If there are major feature changes, check the latest release to see if update_package.bat will work. If not, you will need to re-download and re-extract the package from Hugging Face.

Run the update_package.bat file
- It will clone the repo and copy the src folder from the repo to the package.

Alternative Manual Installation

You should be able to navigate into the folder and then pull the repo to update it.

cd ai-voice-cloning
git pull

If large features have been added, you may need to delete the venv and then re-run the setup-cuda script to make sure there are no package issues.

Linux via Docker

You should be able to navigate into the folder and then pull the repo to update it, then rebuild your Docker image.

cd ai-voice-cloning
git pull
./setup-docker.sh

Documentation

Troubleshooting Manual Installation

The terminal is your friend. Any errors or issues will pop up in the terminal when you try to run, and you can start debugging from there.

If torch gets messed up somewhere in the process, you may have to reinstall it: uninstall it, then reinstall it as follows. Make sure to type Y to confirm the deletion.

.\venv\Scripts\activate.bat
pip uninstall torch
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
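
After reinstalling, you can check from inside the activated venv whether torch imports and whether it can see the GPU. The following is a minimal sketch, not part of this repo; the function name torch_status is hypothetical, while torch.__version__, torch.version.cuda, and torch.cuda.is_available() are standard PyTorch attributes.

```python
import importlib.util


def torch_status():
    """Return a one-line summary of the installed torch build, if any."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported only after confirming it is present

    cuda_build = torch.version.cuda  # None for CPU-only builds
    available = torch.cuda.is_available()
    return (
        f"torch {torch.__version__}, built for CUDA {cuda_build}, "
        f"GPU available: {available}"
    )


if __name__ == "__main__":
    print(torch_status())
```

If the summary shows a CUDA build of None or GPU available: False, the installed wheel or the NVIDIA driver is the likely culprit, and repeating the reinstall steps above is a reasonable next step.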

Bug Reporting

If you run into any problems, please open up a new issue on the issues tab.

Tips for developers

setup-cuda.bat should have everything that you need for the packages to be installed. All of the different requirements files make the script quite messy, but each repo has its requirements installed, and then at the end, the requirements.txt in the root is needed to change the packages back to versions compatible with this repo.
