SpeakerRecognition_tutorial

A PyTorch implementation of a d-vector based speaker recognition system.
All the features needed for training and testing are uploaded.

Requirements

python 3.5+
pytorch 1.0.0
pandas 0.23.4
numpy 1.13.3
pickle 4.0 (format version; pickle is part of the Python standard library)
matplotlib 2.1.0

Datasets

We used the dataset collected through the following task.

No. 10063424, 'development of distant speech recognition and multi-task dialog processing technologies for in-door conversational robots'

Specification

Korean read speech corpus (ETRI read speech)
Clean speech at a distance of 1m and a direction of 0 degrees
16 kHz, 16-bit

We uploaded 40-dimensional log mel filterbank energy features extracted from the above dataset.
The features were extracted with the python_speech_features library.
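
To give a feel for what "40-dimensional log mel filterbank" means, the sketch below computes where the 40 triangular filters would sit for 16 kHz audio. It assumes the common 2595 * log10(1 + f / 700) mel formula and a band from 0 Hz up to the Nyquist frequency; the exact extraction settings used for the uploaded features are not stated here, and the function names are illustrative only.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (assumed HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters=40, sample_rate=16000, low_hz=0.0, high_hz=None):
    """Return the center frequency in Hz of each triangular mel filter."""
    if high_hz is None:
        high_hz = sample_rate / 2.0  # assumed upper edge: the Nyquist frequency
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    # n_filters triangles need n_filters + 2 equally spaced points on the mel axis;
    # the inner n_filters points are the filter centers.
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * i) for i in range(1, n_filters + 1)]


centers = mel_center_frequencies()
print(len(centers))                     # one center per feature dimension
print([round(c) for c in centers[:5]])  # filters are densely packed at low frequencies
```

Each frame of an utterance is then represented by the log energy measured under each of these 40 filters.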

* Train

24000 utterances, 240 folders (240 speakers)
Size: 3 GB
feat_logfbank_nfilt40 - train

* Enroll & test

20 utterances, 10 folders (10 speakers)
Size: 11 MB
feat_logfbank_nfilt40 - test
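
After downloading, it can be worth checking that the folder counts match the numbers above. The following is a minimal sketch that assumes one sub-folder per speaker directly under the feature root; the path 'feat_logfbank_nfilt40/train' and the helper names are assumptions, so adjust them to the actual layout.

```python
from pathlib import Path


def count_utterances(feature_root):
    """Map each speaker folder name to the number of files found beneath it."""
    root = Path(feature_root)
    counts = {}
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Count every regular file below the speaker folder, whatever its extension.
        counts[speaker_dir.name] = sum(1 for f in speaker_dir.rglob("*") if f.is_file())
    return counts


def summarize(counts):
    """Return (number of speakers, total number of utterance files)."""
    return len(counts), sum(counts.values())


train_root = Path("feat_logfbank_nfilt40/train")  # assumed location of the training features
if train_root.is_dir():
    n_speakers, n_utterances = summarize(count_utterances(train_root))
    print(n_speakers, "speakers,", n_utterances, "utterances")
```

For the training set described above, the expected result would be 240 speakers and 24000 utterances.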

Usage

1. Training

This step trains the background model, a ResNet-based speaker classifier.
You can change the training settings in the 'train.py' file.

python train.py
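
As a rough sanity check when reading the training loss, the sketch below computes the softmax cross-entropy of a classifier that has not learned anything yet. It assumes the speaker classifier is trained with cross-entropy over the 240 training speakers, which is a common choice but is not confirmed here; the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct speaker index."""
    return -math.log(softmax(logits)[target])


n_speakers = 240  # number of training speakers listed in the Datasets section
# Equal scores for every speaker model an untrained classifier.
untrained_loss = cross_entropy([0.0] * n_speakers, target=0)
print(round(untrained_loss, 2))  # ln(240), about 5.48
```

Under that assumption, a loss that starts near 5.48 and then falls indicates that the model is learning to separate the speakers.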

2. Enrollment

Extract the speaker embeddings (d-vectors) from the 10 enrollment speech files.
The embeddings are taken from the last hidden layer of the background model.
All embeddings are saved in the 'enroll_embeddings' folder.

python enroll.py
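
The idea of turning hidden-layer activations into a single d-vector can be sketched as follows. This assumes the usual d-vector recipe of averaging the activations over the segments of an utterance and scaling the result to unit length; whether 'enroll.py' averages or normalizes in exactly this way is an assumption, and the tiny 3-dimensional vectors are made up for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def d_vector(activations):
    """Average per-segment activation vectors, then L2-normalize the mean."""
    n = len(activations)
    dim = len(activations[0])
    mean = [sum(a[i] for a in activations) / n for i in range(dim)]
    return l2_normalize(mean)


# Hypothetical last-hidden-layer outputs for two segments of one enrollment utterance.
segments = [[1.0, 2.0, 2.0], [3.0, 2.0, 6.0]]
embedding = d_vector(segments)
print([round(x, 3) for x in embedding])
```

Normalizing to unit length keeps later cosine-based comparisons independent of the embedding's scale.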

3. Testing

For speaker verification, you can change the settings in the 'verification.py' file.

python verification.py
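
Verification answers the question "is this test utterance from the claimed speaker?". A minimal sketch of that decision, assuming cosine similarity between the test embedding and the claimed speaker's enrolled embedding and a fixed threshold, is shown below. The scoring function, the threshold of 0.5, and the example vectors are assumptions rather than values taken from 'verification.py'.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(test_embedding, enrolled_embedding, threshold=0.5):
    """Accept the identity claim when the similarity reaches the threshold."""
    score = cosine_similarity(test_embedding, enrolled_embedding)
    return score >= threshold, score


enrolled = [1.0, 0.0]      # hypothetical enrolled d-vector of the claimed speaker
same_speaker = [0.9, 0.1]  # hypothetical test d-vector close to the enrolled one
other_speaker = [0.0, 1.0] # hypothetical test d-vector from a different speaker

print(verify(same_speaker, enrolled)[0])   # accepted
print(verify(other_speaker, enrolled)[0])  # rejected
```

Raising the threshold reduces false acceptances at the cost of more false rejections, so it is usually tuned on held-out trials.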

For speaker identification, you can change the settings in the 'identification.py' file.

python identification.py
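
Identification instead asks "which enrolled speaker is this?". One way to sketch it, assuming cosine scoring against every enrolled embedding and picking the best match, is given below. The speaker IDs and vectors are hypothetical, and 'identification.py' may score differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(test_embedding, enrolled):
    """Return the best-matching enrolled speaker and the score for each speaker."""
    scores = {spk: cosine_similarity(test_embedding, emb) for spk, emb in enrolled.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical enrolled d-vectors keyed by speaker ID.
enrolled = {
    "spk_a": [1.0, 0.0, 0.0],
    "spk_b": [0.0, 1.0, 0.0],
    "spk_c": [0.0, 0.0, 1.0],
}
best, scores = identify([0.1, 0.8, 0.2], enrolled)
print(best)
```

This closed-set form always returns one of the enrolled speakers; rejecting unknown speakers would additionally need a threshold on the best score.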

Author

Youngmoon Jung ([email protected]) at KAIST, South Korea
