Keyword spotter

A simple project on speech recognition.

Sebastian Thomas (datascience at sebastianthomas dot de)

The goal of this project is to recognize which keyword, out of a list of ten given keywords, is spoken in an audio recording.

The project extends the introductory tutorial on speech command recognition from TensorFlow.

It uses version 0.0.2 of the speech_commands dataset by Pete Warden. The dataset contains 105,829 WAV files, each at most 1 second long. Each file contains a single spoken command out of a list of 35 commands.
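Because the clips last at most one second, they usually have to be brought to a common length before they can be batched for a classifier. The following is a minimal sketch of that step using only the Python standard library. It assumes 16 kHz, 16-bit mono WAV files; the function names `read_wav` and `pad_or_trim` are illustrative and not taken from this repository.

```python
import array
import sys
import wave

# Assumption: clips are 16 kHz, 16-bit, mono PCM.
SAMPLE_RATE = 16_000


def read_wav(path):
    """Read a 16-bit mono WAV file and return its samples as a list of ints."""
    with wave.open(str(path), "rb") as wav_file:
        if wav_file.getnchannels() != 1 or wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        frames = wav_file.readframes(wav_file.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples.tolist()


def pad_or_trim(samples, length=SAMPLE_RATE):
    """Return exactly `length` samples: trim long clips, zero-pad short ones."""
    trimmed = list(samples[:length])
    return trimmed + [0] * (length - len(trimmed))
```

With these helpers, `pad_or_trim(read_wav(path))` yields one second of audio for every clip, regardless of its original duration.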

For demonstration purposes, a REST API was implemented. It was inspired by a tutorial from Valerio Velardo's series Deep Learning (Audio) Application: From Design to Deployment.
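As an illustration of how a client might talk to such an API, here is a minimal sketch using only the Python standard library. The URL, the choice to send the raw WAV bytes as the request body, and the `keyword` field in the JSON response are all assumptions made for this example; the actual route and payload format are defined by the server in this repository and may differ.

```python
import json
import urllib.request

# Hypothetical endpoint, chosen for illustration only.
PREDICT_URL = "http://127.0.0.1:5000/predict"


def build_request(wav_bytes, url=PREDICT_URL):
    """Build a POST request that carries the WAV file as the request body."""
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def predict_keyword(wav_path, url=PREDICT_URL):
    """Send a WAV file to the server and return the predicted keyword.

    Assumes the server answers with JSON of the form {"keyword": "..."}.
    """
    with open(wav_path, "rb") as wav_file:
        request = build_request(wav_file.read(), url)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return payload["keyword"]
```

Separating `build_request` from `predict_keyword` keeps the request construction testable without a running server.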

Content

Data mining, analysis, training and evaluation of the classifier: Predictive analysis (predictive_analysis.ipynb)

Main development: keyword_spotter.py

REST API: server.py, client.py

Future work

tune more hyperparameters
use class weights for training (we have imbalanced classes)
add background noise to the instances
use other forms of data augmentation, such as time shifting
add a silence label
consider other classifier models
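Several of these ideas are simple to prototype. The sketch below shows one possible way to compute class weights for imbalanced labels, shift a clip in time, and mix in background noise, using only the Python standard library. The function names, the "balanced" weighting heuristic (total count divided by the number of classes times the class count), and the default noise gain are assumptions for illustration, not part of the existing code.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


def time_shift(samples, shift):
    """Shift a clip by `shift` samples, zero-padding so the length is kept.

    Positive values delay the signal, negative values advance it.
    """
    n = len(samples)
    if shift >= 0:
        shift = min(shift, n)
        return [0] * shift + list(samples[: n - shift])
    shift = min(-shift, n)
    return list(samples[shift:]) + [0] * shift


def add_background_noise(samples, noise, noise_gain=0.1):
    """Mix a noise clip into a speech clip, sample by sample.

    The result is as long as the shorter of the two inputs.
    """
    return [s + noise_gain * n for s, n in zip(samples, noise)]
```

In Keras, a weight dictionary of this shape, keyed by integer class index, can be passed as the `class_weight` argument of `Model.fit`. The shift amount and noise gain would typically be drawn at random for each training instance.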

References

Warden, Pete: Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition. arXiv:1804.03209, 2018.

Velardo, Valerio: Deep Learning (Audio) Application: From Design to Deployment. YouTube, 2020.
