The current repository is a copy of the speaker recognition challenge by Radboud University. My ResNet/ResNeXt implementation for this challenge is found in the skeleton/layers/resnet.py, .../resnet_block.py and resnext_block.py files. Other changes, such as parameters, layer/block structure and training cycling, can be found in the skeleton/models/prototype.py file. The original code is written to work on a cluster of Radboud University in Nijmegen through GitLab.
The accompanying project paper is found under project_paper.pdf.
Welcome to the speaker recognition challenge of the Machine Learning in Practice course! We hope you will have fun playing around with training and iterating on machine learning models, learning to use the Data Science GPU cluster and Linux command line environment as well as outperforming the other teams! :)
The goal of this challenge is to build an Automatic Speaker Recognition system and make it perform optimally, given the conditions of this challenge.
We have provided some skeleton code for training a neural-network-based system, because neural networks lie at the base of the current state of the art. In order to improve on this baseline system, you are free to:
- experiment with the neural architecture
- experiment with different features, data augmentation and normalization techniques
- experiment with other models than neural networks
- experiment with combining systems
You are limited in the following aspects:
- you cannot use additional speaker training data
- you cannot use existing automatic speaker recognition code
- you cannot use pre-trained models for automatic speaker recognition
We have prepared the following documentation to give you some background information about the challenge, guide you through the process of working with the compute cluster, and explain the skeleton code you find in this repository.
- Honour code: Here you find the basic rules you have to adhere to in order to make the challenge fair, so that you don't fail the course...
- Automatic Speaker Recognition: Here you find some background on what Automatic Speaker Recognition is, and how the quality of an automatic speaker recognition system is assessed.
- Project details: Here you will find information about the way the data is structured, and where it can be found.
- Compute cluster: Here you can find instructions on how to work with the compute cluster.
- Forking clone! This section details the easiest way to fork this repository and clone the code so that you can work with it.
- Skeleton: This is the place where the layout of the skeleton code that you find in this repository is explained.