zero-shot keyword spotting with KWS test dataset using ImageBind

reddiedev/197z-kws

EEE 197z Project 2 - Zero-Shot KWS using ImageBind

An algorithm that uses ImageBind to classify the KWS test dataset (Google Speech Commands v2, 35 classes) in a zero-shot manner.
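As a rough illustration of the zero-shot idea, the sketch below picks the keyword whose text embedding is closest (by cosine similarity) to an audio embedding. It is not the notebook's actual code: the function name `zero_shot_classify` is hypothetical, and the embeddings are toy vectors standing in for what ImageBind's audio and text encoders would produce.

```python
import numpy as np

def zero_shot_classify(audio_emb, text_embs, labels):
    """Return the label whose text embedding best matches the audio embedding.

    audio_emb: 1-D array (one audio clip), assumed to come from an audio encoder.
    text_embs: 2-D array, one row per candidate keyword, from a text encoder.
    """
    # Normalize so the dot product equals cosine similarity.
    a = audio_emb / np.linalg.norm(audio_emb)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = t @ a
    # Softmax over similarities gives a score per keyword that sums to 1.
    exp = np.exp(sims - sims.max())
    probs = exp / exp.sum()
    return labels[int(np.argmax(probs))], probs

# Toy stand-ins for real embeddings (assumption: 3 keywords, 3 dimensions).
labels = ["yes", "no", "stop"]
text_embs = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
audio_emb = np.array([0.1, 0.9, 0.2])

prediction, probs = zero_shot_classify(audio_emb, text_embs, labels)
print(prediction)  # → no
```

No keyword-specific training is involved: adding a new keyword only requires adding its text embedding as another row.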

Author: Sean Red Mendoza | 2020-01751 | [email protected]

Tools/ References

Goals

  • Randomly pick an audio clip from the test split and classify it (audio player in the UI)
  • Let the user record their own voice for testing (audio recorder in the UI, powered by Gradio)
  • Show summary statistics when evaluating n samples (number of data points, accuracy)
  • Provide a comparison table of SOTA model scores
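The summary statistics goal could be sketched as follows. This is a minimal example under stated assumptions, not the notebook's implementation: `evaluate_samples` is a hypothetical helper, the dataset is assumed to be a list of `(audio, label)` pairs, and `classify` is any callable that maps an audio item to a predicted label.

```python
import random

def evaluate_samples(dataset, classify, n, seed=0):
    """Classify n randomly chosen samples and summarize the results."""
    rng = random.Random(seed)  # fixed seed so the same samples are drawn each run
    picked = rng.sample(dataset, min(n, len(dataset)))
    correct = sum(1 for audio, label in picked if classify(audio) == label)
    total = len(picked)
    return {
        "n": total,
        "correct": correct,
        "accuracy": correct / total if total else 0.0,
    }

# Toy data: strings stand in for audio clips (assumption for illustration only).
toy_dataset = [("clip_a", "yes"), ("clip_b", "no"), ("clip_c", "yes"), ("clip_d", "no")]

def always_yes(audio):
    # Stub classifier that ignores its input.
    return "yes"

summary = evaluate_samples(toy_dataset, always_yes, n=4)
print(summary)  # → {'n': 4, 'correct': 2, 'accuracy': 0.5}
```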

Usage

  1. Clone this repository into a working directory
git clone https://github.com/reddiedev/197z-kws
cd 197z-kws
  2. Prepare the environment for running the notebook
conda create --name kws
conda activate kws

sudo apt install ffmpeg

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install jupyter jupyterlab ipywidgets==7.6.5 numpy ipython gradio ipywebrtc notebook

jupyter labextension install jupyter-webrtc
  3. Run the demo.ipynb Jupyter notebook

  4. View the SOTA model comparison in comparison.md
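Before opening the notebook, it may help to confirm that the setup steps took effect. The sketch below is an optional check, not part of the repository: `check_environment` is a hypothetical helper, and the module list is an assumption based on the packages installed above.

```python
import importlib.util
import shutil

# Assumed import names for the packages installed in the setup steps.
REQUIRED_MODULES = ["torch", "torchaudio", "gradio", "ipywebrtc"]

def check_environment(modules=REQUIRED_MODULES, tools=("ffmpeg",)):
    """Report which Python modules and command-line tools cannot be found."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return {"missing_modules": missing_modules, "missing_tools": missing_tools}

report = check_environment()
print(report)
```

Empty lists in both fields suggest the environment is ready. Run the check inside the activated `kws` environment so that it inspects the right interpreter.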

Acknowledgements

Professor Rowel Atienza