CycleGAN-VC PyTorch

This is a reimplementation of CycleGAN-VC in PyTorch written by Bence Halpern. Additional work on UASpeech enhancement was done by Luke Prananta.

The code is available under the GPL-3 license.

Installation

Main requirements:

  • Python 3.5
  • nnmnkwii 0.0.20
  • PyTorch 1.5.0
  • librosa 0.7.2
  • PyWorld 0.2.10

To create the environment, use the following command:

conda env create -f environment.yml
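Then activate the environment (the name below is only a guess; use whatever name environment.yml defines):

conda activate cyclegan_pytorch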

The VCC2016 dataset can be downloaded by running the following command:

python download.py

The model can be trained with the following command:

python train.py

Utterances can be converted with convert.py; the help output is shown below, and an example invocation follows it.

Convert CycleGAN utterance

optional arguments:
  -h, --help            show this help message and exit
  --file_path FILE_PATH
                        Path of speech file to convert.
  --output_dir OUTPUT_DIR
                        Output directory for converted voice.
  --data_root DATA_ROOT
                        VCC 2016 dataroot
  --domain_A            Check if converting from domain A
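
For example, to convert a single utterance from domain A (the file and directory paths below are placeholders; point them at your own data):

python convert.py --file_path path/to/utterance.wav --output_dir converted --data_root path/to/vcc2016 --domain_A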

Short summary of what each file does

├── analysis.py - ?
├── clean_uaspeech.py - Generates the "cleaned" version of the UA Speech dataset (assumes data is denoised)
├── convert.py - Convert CycleGAN utterance
├── cyclegan.py - CycleGAN model
├── data_utils.py - Data utils 
├── download.py - Downloads the VCC2016 dataset
├── environment.yml - Conda environment file (TODO: Needs to be checked)
├── evaluation.py - 
├── f0_wrapper.py - 
├── mcep_wrapper.py -
├── modules.py  - 
├── NOTES.MD - Notes that Bence has written down during the implementation
├── preprocess.py - Preprocesses the VCC2016 dataset?
├── preprocess_test.py - 
├── preprocess_ts.py - 
├── README.md
├── requirements.txt
├── speaker_wordlist.xls
├── speedup.py
├── test.py
├── train_with_test.py
├── uaspeech.py
└── utils.py

How do I retrain it with my own data?

The current way of training is through the nnmnkwii FileDataSource wrapper; see data_utils.py for an example. This can be surprisingly convenient, because many standard speech datasets already have an nnmnkwii wrapper, so retraining essentially requires you to inherit the right dataset class and provide the right dataset path.

You can also implement a custom wrapper for your own datasets; a minimal sketch is shown below. Please have a thorough look at the relevant nnmnkwii documentation here.
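
The following is only a sketch of such a custom wrapper. The directory layout (my_corpus/speaker_A/*.wav), the sampling rate, and the 24-dimensional coded spectral envelope are illustrative assumptions; match them to the settings actually used in data_utils.py and preprocess.py.

from glob import glob
from os.path import join

import librosa
import numpy as np
import pyworld
from nnmnkwii.datasets import FileDataSource, FileSourceDataset


class MyWavSource(FileDataSource):
    """Collects wav files for one speaker and extracts spectral features."""

    def __init__(self, data_root, speaker, sr=16000, mcep_dim=24):
        # data_root/speaker/*.wav is an assumed layout, not the repo's.
        self.data_root = data_root
        self.speaker = speaker
        self.sr = sr
        self.mcep_dim = mcep_dim

    def collect_files(self):
        # Return the list of wav files belonging to this speaker.
        return sorted(glob(join(self.data_root, self.speaker, "*.wav")))

    def collect_features(self, path):
        # Load audio and extract a WORLD spectral envelope, then compress it
        # to a low-dimensional coded representation.
        x, _ = librosa.load(path, sr=self.sr)
        x = x.astype(np.float64)
        f0, timeaxis = pyworld.harvest(x, self.sr)
        sp = pyworld.cheaptrick(x, f0, timeaxis, self.sr)
        mcep = pyworld.code_spectral_envelope(sp, self.sr, self.mcep_dim)
        return mcep.astype(np.float32)


# FileSourceDataset makes the source indexable like a regular
# utterance-level dataset.
dataset_A = FileSourceDataset(MyWavSource("my_corpus", "speaker_A"))
print(len(dataset_A), dataset_A[0].shape)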

I might implement friendlier support for custom retraining in the future.

Do you have a pretrained model?

Yes, of course. You can download it from my Google Drive folder. Make a folder called checkpoint and put the downloaded files there.
