Audio inpainting with generative adversarial network

This is the repository for the DAS project "Audio inpainting with Generative Adversarial Networks". In this project, the basic Wasserstein Generative Adversarial Network (WGAN) is compared with a newly proposed WGAN architecture that uses both short-range and long-range neighboring borders to improve the inpainting. The focus is on gaps in the range of 500 ms, using three different datasets: PIANO, SOLO and MAESTRO. Detailed information about the project and the datasets can be found in tex/report/report.pdf or at https://arxiv.org/abs/2003.07704. A few samples are available at https://blogs.ethz.ch/web-audio-inpainting-gan/
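
To make the setup concrete, here is a minimal sketch that cuts a 500 ms gap out of a signal and collects the context on both sides. The sample rate (16 kHz), the border lengths, and all function and parameter names are illustrative assumptions, not the values or API used in this repository; see the report for the actual configuration.

```python
def split_around_gap(signal, sample_rate=16000, gap_start_s=2.0,
                     gap_ms=500, short_ms=500, long_ms=2000):
    """Split a 1-D signal into a gap and its short- and long-range borders.

    All default values are illustrative assumptions, not the project's settings.
    In this sketch the long-range border simply contains the short-range one.
    """
    def to_samples(ms):
        return int(sample_rate * ms / 1000)

    gap_start = int(sample_rate * gap_start_s)
    gap_end = gap_start + to_samples(gap_ms)
    short_len, long_len = to_samples(short_ms), to_samples(long_ms)

    if gap_start - long_len < 0 or gap_end + long_len > len(signal):
        raise ValueError("signal too short for the requested borders")

    return {
        "gap": signal[gap_start:gap_end],  # the part the generator has to fill in
        "short_left": signal[gap_start - short_len:gap_start],
        "short_right": signal[gap_end:gap_end + short_len],
        "long_left": signal[gap_start - long_len:gap_start],
        "long_right": signal[gap_end:gap_end + long_len],
    }


# Usage: 5 s of silence at 16 kHz as a stand-in for real audio.
parts = split_around_gap([0.0] * (5 * 16000))
print({name: len(chunk) for name, chunk in parts.items()})
```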

Please keep this repository as clean as possible and do not commit data or notebooks with executed cells.

How to use the code in this project

Go to the folder code
```
cd code
```

Initialize submodules

```
git submodule update --init --recursive
```

Install the packages (create a virtual environment first)
```
pip install -r requirements.txt
```
You may want to use the no-GPU version of the packages (requirements_nogpu.txt) on your local computer.
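
The choice between the two requirements files can be sketched as follows. Treating the presence of the `nvidia-smi` tool on the PATH as a sign of a usable GPU is an assumption of this sketch, and the helper name is hypothetical; it is only a rough heuristic.

```python
import shutil


def pick_requirements(has_gpu=None):
    """Return the name of the requirements file to install.

    Assumption: if `nvidia-smi` is on the PATH, an NVIDIA GPU is available.
    Pass has_gpu explicitly to override this heuristic.
    """
    if has_gpu is None:
        has_gpu = shutil.which("nvidia-smi") is not None
    return "requirements.txt" if has_gpu else "requirements_nogpu.txt"


print("pip install -r", pick_requirements())
```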

Download and train 'PIANO' dataset

Go to folder
```
cd code
```
Download and make 'PIANO' dataset (http://deepyeti.ucsd.edu/cdonahue/wavegan/data/mancini_piano.tar.gz)
```
python download_data.py
python make_piano_dataset.py
```
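
For orientation, downloading and unpacking the archive by hand could look roughly like the sketch below. It is not the content of download_data.py; the function names and the `data` output folder are assumptions made for illustration.

```python
import tarfile
import urllib.request
from pathlib import Path

PIANO_URL = "http://deepyeti.ucsd.edu/cdonahue/wavegan/data/mancini_piano.tar.gz"


def download(url, target):
    """Download url to target unless the file already exists."""
    target = Path(target)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))
    return target


def extract(archive, dest):
    """Extract a .tar.gz archive into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # Only extract archives that come from a source you trust.
    with tarfile.open(str(archive), "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(str(dest))
    return names


# Usage (commented out because it needs network access):
# archive = download(PIANO_URL, "data/mancini_piano.tar.gz")
# print(len(extract(archive, "data")), "entries extracted")
```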
Go to folder
```
cd code/experiments
```

Train the basic and extended WGAN models

```
python myexperiments-basic-piano.py
python myexperiments-extend-piano.py
```

Download and train 'SOLO' dataset

Go to folder
```
cd code
```
Download 'SOLO' dataset (https://www.kaggle.com/zhousl16/solo-audio)
Make the 'SOLO' dataset
```
python make_solo_dataset.py
```
Go to folder
```
cd code/experiments
```

Train the basic and extended WGAN models

```
python myexperiments-basic-solo.py
python myexperiments-extend-solo.py
```

Download and train 'MAESTRO' dataset

Go to folder
```
cd code
```
Download 'MAESTRO' dataset (https://magenta.tensorflow.org/datasets/maestro)
```
python download_data_maestro.py
```
Go to folder
```
cd code/experiments
```

Train the basic and extended WGAN models

```
python myexperiments-basic-maestro.py
python myexperiments-extend-maestro.py
```
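
The six training scripts listed in the sections above follow one naming pattern, so they can be launched in sequence from a small helper. This is a sketch only: the helper names are hypothetical, and the `code/experiments` working directory assumes the helper is started from the repository root.

```python
import subprocess
import sys

DATASETS = ("piano", "solo", "maestro")
MODELS = ("basic", "extend")


def experiment_scripts(datasets=DATASETS, models=MODELS):
    """Build the training script names following the naming used in this README."""
    return [f"myexperiments-{model}-{dataset}.py"
            for dataset in datasets for model in models]


def run_all(scripts, cwd="code/experiments"):
    """Run each script in turn and stop at the first failure."""
    for script in scripts:
        print("running", script)
        subprocess.run([sys.executable, script], cwd=cwd, check=True)


print(experiment_scripts(datasets=("piano",)))
# To train everything (long-running): run_all(experiment_scripts())
```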

Testing the trained models

Go to folder
```
cd code/experiments
```
Run test script (make sure that the path to the trained model is correct)
```
python myexperiments-test-model.py
```
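
A quick way to verify the model path before launching the test script is sketched below. The helper is hypothetical and not part of the repository; it only checks that the folder exists and is not empty, not that it holds a valid checkpoint.

```python
from pathlib import Path


def check_model_dir(path):
    """Return the resolved model folder, or raise if it is missing or empty."""
    model_dir = Path(path).expanduser().resolve()
    if not model_dir.is_dir():
        raise FileNotFoundError(f"model folder not found: {model_dir}")
    if not any(model_dir.iterdir()):
        raise FileNotFoundError(f"model folder is empty: {model_dir}")
    return model_dir


# Usage, with a placeholder path to replace by your own:
# print(check_model_dir("path/to/trained/model"))
```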

General project information

Students: Ebner Pirmin, Amr Eltelt
Supervisor: Nathanaël Perraudin

Previous work

Previous work on audio inpainting

Deep learning based methods
- A context encoder for audio inpainting: https://arxiv.org/pdf/1810.12138.pdf
Non-deep-learning methods
- Audio declipping with social sparsity: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6853863
- LPC: http://ant-s4.hsu-hh.de/dafx/papers/DAFX02_Kauppinen_Roth_signal_extrapolation.pdf
Engineering methods
- Inpainting of long audio segments with similarity graphs: https://arxiv.org/pdf/1607.06667.pdf
- You can also check the demo: https://lts2.epfl.ch/web-audio-inpainting/
More related work: check the related work sections of
- https://arxiv.org/pdf/1810.12138.pdf
- https://arxiv.org/pdf/1607.06667.pdf

Previous work on audio generation using GAN

The main works to be aware of are:

WaveGAN: https://arxiv.org/pdf/1802.04208.pdf, code at https://github.com/chrisdonahue/wavegan
TiFGAN: https://arxiv.org/pdf/1902.04072.pdf, code at https://github.com/tifgan/stftGAN, website at https://tifgan.github.io/
GANSynth (maybe): https://magenta.tensorflow.org/gansynth

Important architectures for audio generation

WaveNet: https://deepmind.com/blog/article/wavenet-generative-model-raw-audio
Many papers...

Data sources

Bach piano performances
The Free Music Archive
...

The following datasets are probably not going to work because the audio snippets are too short (to be checked):

Speech Commands Zero through Nine (SC09)
Drum sound effects
Nsynth dataset

Code sources

The main inspiration for the code is: https://github.com/nperraud/CodeGAN
Notebook to start working: https://github.com/nperraud/CodeGAN/blob/audio-inpainting/audio_experiment/GAN-audio-inpainting.ipynb
Main Git repository used as a submodule for the GAN tools: https://github.com/nperraud/gantools/

Executing code at CSCS

Some help to execute code on CSCS

Global setup and access to CSCS
Python code execution
Checking the list of jobs: squeue -u $USER -l
Storage on CSCS
- Home: 10 GB (max 10'000 files)
- $SCRATCH: unlimited space and number of files, but files are deleted automatically after 30 days without use
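
To spot files that are close to the 30-day limit, a sketch such as the one below can list the candidates. It assumes that "use" is measured by the file's last access time, which may not match the actual CSCS policy, and the function name is hypothetical.

```python
import os
import time
from pathlib import Path


def stale_files(root, max_age_days=30, now=None):
    """List files under root whose last access is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 3600
    return sorted(p for p in Path(root).rglob("*")
                  if p.is_file() and p.stat().st_atime < cutoff)


# Usage: warn a few days early about files in $SCRATCH, if the variable is set.
scratch = os.environ.get("SCRATCH")
if scratch:
    for path in stale_files(scratch, max_age_days=25):
        print(path)
```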
