An algorithm using ImageBind that classifies the KWS (keyword spotting) test dataset (Google Speech Commands v2, 35 classes) in a zero-shot manner.
Author: Sean Red Mendoza | 2020-01751 | [email protected]
- Randomly pick an audio clip from the test split and classify it (audio player in the UI; the zero-shot recipe is sketched after this list)
- Let the user record their own voice for testing (audio recorder in the UI, powered by Gradio; see the recorder sketch at the end of this README)
- Show summary statistics during evaluation over n samples (number of data points, accuracy)
- Comparison table of SOTA model scores
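The classification itself follows the usual zero-shot recipe: embed each of the 35 keyword labels as a text prompt and each test clip as audio with ImageBind, then predict the label whose text embedding is closest to the audio embedding. Below is a minimal sketch of that idea; the import paths assume the facebookresearch/ImageBind package layout, and the prompt template and `classify` helper are illustrative rather than the notebook's exact code.

```python
# Minimal zero-shot KWS sketch with ImageBind (illustrative, not the notebook's exact code).
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"

# First 10 of the 35 Speech Commands v2 keywords shown; extend with the full label list.
labels = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]
prompts = [f"a recording of someone saying {w}" for w in labels]  # assumed prompt template

model = imagebind_model.imagebind_huge(pretrained=True).to(device).eval()

def classify(wav_path: str) -> str:
    """Embed the audio clip and every text prompt, then return the closest label."""
    inputs = {
        ModalityType.TEXT: data.load_and_transform_text(prompts, device),
        ModalityType.AUDIO: data.load_and_transform_audio_data([wav_path], device),
    }
    with torch.no_grad():
        emb = model(inputs)
    # Cosine similarity between the single audio embedding and all text embeddings
    audio = torch.nn.functional.normalize(emb[ModalityType.AUDIO], dim=-1)
    text = torch.nn.functional.normalize(emb[ModalityType.TEXT], dim=-1)
    scores = (audio @ text.T).squeeze(0)
    return labels[int(scores.argmax())]
```

Evaluation accuracy over n samples then reduces to counting how often `classify` matches the ground-truth label of each sampled test clip.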
- Clone this repository into a working directory
git clone https://github.com/reddiedev/197z-kws
cd 197z-kws
- Prepare the environment for running the notebook
conda create --name kws
conda activate kws
sudo apt install ffmpeg
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install jupyter jupyterlab ipywidgets==7.6.5 numpy ipython gradio ipywebrtc notebook
jupyter labextension install jupyter-webrtc
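Optionally, sanity-check the environment before launching Jupyter; the short snippet below only verifies that the installed packages import and that the CUDA build of PyTorch sees a GPU.

```python
# Quick environment check (run inside the activated `kws` environment)
import torch, torchaudio, gradio
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torchaudio", torchaudio.__version__, "| gradio", gradio.__version__)
```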
- Run the demo.ipynb notebook
jupyter notebook
- View the SOTA model comparison in comparison.md
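For reference, this is roughly how the Gradio recorder from the feature list can be wired to the zero-shot classifier. It reuses the illustrative `classify(wav_path)` helper from the sketch above, and the exact `gr.Audio` arguments depend on the installed Gradio version (newer releases use `sources=["microphone"]`, older ones `source="microphone"`).

```python
# Sketch of the microphone-recording demo, assuming the `classify(wav_path)` helper above.
import gradio as gr

demo = gr.Interface(
    fn=classify,                                        # returns the predicted keyword
    inputs=gr.Audio(sources=["microphone"], type="filepath"),
    outputs=gr.Label(label="Predicted keyword"),
    title="ImageBind zero-shot KWS demo",
)
demo.launch()
```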