SpecGan

Spectrogram generator using GANs

I started this project as part of my Bachelor's thesis, "Generator of graphic representations of phonic signals using GAN neural networks", and develop it for learning purposes. The goal of the project is to take advantage of CNNs to generate new synthetic audio clips. To make this possible, both the dataset clips and the generated clips are represented as spectrograms or mel spectrograms. The project could potentially be used for audio data augmentation or to generate sounds for games, movies, and simulations.

Model

The model is based on the TensorFlow DCGAN tutorial: https://www.tensorflow.org/tutorials/generative/dcgan. Major changes made after experiments to improve the results:

-added noise to the discriminator

-normalized the input to values from 0 to 1

-changed the activation function from tanh to sigmoid
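
The three changes above can be illustrated with a minimal NumPy sketch. It is not the code from Model.ipynb. It assumes that the noise is Gaussian and added to the images fed to the discriminator, and that sigmoid is the generator's output activation (the tutorial's generator ends in tanh); the README does not spell out either detail. The function names, the noise level, and the random stand-in data are hypothetical.

```python
import numpy as np

def min_max_normalize(x, eps=1e-8):
    """Rescale an array so its values lie in the 0-1 range."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.min()) / (x.max() - x.min() + eps)

def sigmoid(z):
    """Squash values into (0, 1), matching data normalized to 0-1."""
    return 1.0 / (1.0 + np.exp(-z))

def add_gaussian_noise(batch, stddev, rng):
    """Assumed form of 'noise to the discriminator': noise on its inputs."""
    return batch + rng.normal(0.0, stddev, size=batch.shape)

rng = np.random.default_rng(0)

# Stand-in for a batch of dB-scaled spectrograms (random values, not real data).
spectrograms_db = rng.uniform(-80.0, 0.0, size=(4, 128, 128))
real = min_max_normalize(spectrograms_db)

# Stand-in for generator pre-activations passed through the sigmoid output.
fake = sigmoid(rng.standard_normal((4, 128, 128)))

# Both real and generated batches would be perturbed before the discriminator.
noisy_real = add_gaussian_noise(real, stddev=0.05, rng=rng)
noisy_fake = add_gaussian_noise(fake, stddev=0.05, rng=rng)
```

The point of pairing sigmoid with 0-1 normalization is that the generator's output range then matches the range of the training data, whereas tanh would produce values between -1 and 1.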

Datasets

Mozilla Common Voice - human speech with noise: https://commonvoice.mozilla.org/en

Recording of drums: https://www.hexawe.net/mess/200.Drum.Machines/

Speak Like a Dog dataset: https://drive.google.com/drive/folders/1TmG1yjc0_RLUX7U0ZJGLPVWkAwiSkSWY

(WIP) Subset of the female recordings from the VCTK corpus: https://datashare.ed.ac.uk/handle/10283/3443

Data preprocessing

Steps taken to prepare a tensor of spectrograms from audio clips:

Trim silence using a 15 dB threshold
Split the audio clips into 2 s chunks
Create a spectrogram or mel spectrogram with librosa (parameters are set to obtain the desired spectrogram size)
Save all spectrograms to a tf.Tensor
Normalize values to the 0-1 range (done in the model code)
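
As a rough illustration of the chunking step and of how the parameters determine the spectrogram size, here is a small NumPy sketch. It is not the code from Data_preparation.ipynb. The 16 kHz sample rate, the n_fft and hop_length values, the helper names, and the choice to drop the incomplete last chunk are all assumptions. The shape formula describes a centered STFT with an even n_fft, which is what librosa.stft computes by default.

```python
import numpy as np

def split_into_chunks(signal, sample_rate, chunk_seconds=2.0):
    """Split a 1-D signal into equal, non-overlapping chunks.

    Assumption: the incomplete chunk at the end is dropped.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = len(signal) // chunk_len
    return signal[: n_chunks * chunk_len].reshape(n_chunks, chunk_len)

def stft_shape(n_samples, n_fft, hop_length):
    """(frequency bins, frames) of a centered STFT with an even n_fft."""
    return 1 + n_fft // 2, 1 + n_samples // hop_length

sample_rate = 16000  # assumed; the README does not state the sample rate

# Five seconds of random noise as a stand-in for a trimmed audio clip.
signal = np.random.default_rng(0).standard_normal(5 * sample_rate)

chunks = split_into_chunks(signal, sample_rate)
print(chunks.shape)  # (2, 32000)

# Hypothetical parameters: 2 s at 16 kHz gives a 256 x 126 spectrogram.
print(stft_shape(chunks.shape[1], n_fft=510, hop_length=256))  # (256, 126)
```

For a mel spectrogram the number of frames stays the same, while the number of rows is set by the number of mel bands instead of n_fft.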

Results analysis

TensorBoard was added to track the loss functions. Although the loss is less informative for GANs than for other architectures, it helps to identify convergence failure, where the discriminator and the generator do not reach an equilibrium and one of them dominates the other.
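
The kind of check that the loss curves make possible can be sketched in plain Python. This is only a heuristic for illustration, not part of the project: the function name, the window size, and both thresholds are hypothetical and would need tuning for a given loss definition.

```python
def diagnose_gan_losses(gen_losses, disc_losses, window=50, low=0.1, high=3.0):
    """Flag the case where one network's recent loss collapses while the other's is large.

    All thresholds are illustrative assumptions, not values from the project.
    """
    recent_g = gen_losses[-window:]
    recent_d = disc_losses[-window:]
    mean_g = sum(recent_g) / len(recent_g)
    mean_d = sum(recent_d) / len(recent_d)

    if mean_d < low and mean_g > high:
        return "discriminator dominates"
    if mean_g < low and mean_d > high:
        return "generator dominates"
    return "no obvious domination"

# Discriminator loss near zero while the generator loss stays high.
print(diagnose_gan_losses([5.0] * 100, [0.01] * 100))  # discriminator dominates

# Both losses hovering at moderate values.
print(diagnose_gan_losses([0.9] * 100, [1.2] * 100))  # no obvious domination
```

A result of "no obvious domination" says nothing about sample quality, which is why the generated spectrograms still have to be inspected or listened to.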

To analyze the quality of the generated audio, a function that converts spectrograms and mel spectrograms back to audio was implemented using librosa.
