diff --git a/README.md b/README.md
index 9b80205..66967ff 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,23 @@
 # TIDIGITS recipe
-This repository contains a _recipe_ for training an ASR system using the [TIDIGITS database](https://catalog.ldc.upenn.edu/LDC93S10).
+This repository contains a _recipe_ for training an automatic speech recognition (ASR) system using the [TIDIGITS database](https://catalog.ldc.upenn.edu/LDC93S10).
 The recipe is entirely Julia-flavoured and uses the following packages (among others):

 * [Flux](https://github.com/FluxML/Flux.jl) as ML library
 * [FiniteStateTransducers](https://github.com/idiap/FiniteStateTransducers.jl) for WFST compositions
 * [HMMGradients](https://github.com/idiap/HMMGradients.jl) for maximum likelihood training

-Currently the training runs only on CPU and employs a simple greedy decoder. Stay tuned for more!
+Currently the training runs only on CPU and employs a simple greedy decoder.

-### Installation
+## Installation

-Set in your environment the path `TIDIGITS_PATH=\your\path\to\tidigits`.
-If you're using SGE set the command flags in `CPU_CMD`, i.e. the queue options.
+Run `julia --project -e 'using Pkg; Pkg.instantiate()'` to install all the dependencies.
+For the live demo, install [sox](http://sox.sourceforge.net/).

-This can be done e.g. by running `source env.sh` before lunching Julia, where `env.sh` is a script that export these variables.
-Alternatively, the environment variables can be specified [directly in the REPL](https://docs.julialang.org/en/v1/manual/environment-variables/).
+## Live Demo

-Run `julia --project -e 'using Pkg; Pkg.instantiate()'` to install all the dependencies.
+Open a Julia terminal with `julia --project` and type `include("demo.jl")` to try out the ASR system with your own voice. A model trained with configuration `2b` (see below) is already present in this repository.
+
+## Training

 ### Configuration

@@ -26,11 +27,18 @@ This folder must contain the following files:
 * `feat_conf.jl` for feature extraction
 * `model_conf.jl` for model and optimisation parameters (hyperparameters)
 A couple of setups are present in this repository for reference in the folder `conf`.
+Currently a TDNN/ConvNet is used as the acoustic model.

 ### Data preparation

+Set the path `TIDIGITS_PATH=/your/path/to/tidigits` in your shell environment.
+If you're using SGE, set the command flags (i.e. the queue options) in `CPU_CMD`.
+
+This can be done e.g. by running `source env.sh` before launching Julia, where `env.sh` is a script that exports these variables.
+Alternatively, the environment variables can be specified [directly in the REPL](https://docs.julialang.org/en/v1/manual/environment-variables/).
+
 Run `julia --project prepare_data.jl --conf 2a` to extract features and prepare training data using configuration `2a`.
-Features and transctiptions will be saved in the folder `data/uuid/`.
+Features and transcriptions will be saved in the folder `data/uuid/`.
 Here `uuid` is linked to the `feat_conf.jl` file, meaning that if you create a new `model_conf.jl` without modifying feature extraction, you don't need to run data preparation twice.
 If an SGE grid is available, add the flag `--nj N` to split the work into `N` jobs.

@@ -51,7 +59,3 @@ Modify the `conf` by changing the default in the `ArgParse` table.
 ### Evaluation

 Run the script `eval.jl` to calculate the Word Error Rate (WER) and Phone Error Rate (PER).
-
-### Demo
-
-A live demo can be used by running `demo.jl` (requires [sox](http://sox.sourceforge.net/) to be installed in your system).
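The data preparation section above offers two ways to provide `TIDIGITS_PATH` and `CPU_CMD`: exporting them from an `env.sh` sourced before launching Julia, or setting them directly in the REPL. A minimal sketch of the REPL route (the values below are placeholders, not paths or queue options shipped with the recipe):

```julia
# Set the variables described in the data preparation section before running prepare_data.jl.
# Doing this in the REPL is equivalent to exporting them from env.sh before launching Julia.
ENV["TIDIGITS_PATH"] = "/your/path/to/tidigits"  # placeholder: location of the TIDIGITS corpus
ENV["CPU_CMD"] = "-q all.q -cwd"                 # placeholder: SGE queue options, only needed on a grid
```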
diff --git a/models/2a/best_modely_final.bson b/models/2a/best_modely_final.bson
new file mode 100644
index 0000000..c6ffc90
Binary files /dev/null and b/models/2a/best_modely_final.bson differ
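The binary checkpoint added above can be inspected from the Julia REPL. This is only a sketch under the assumption that the file was written with [BSON.jl](https://github.com/JuliaIO/BSON.jl) (the usual choice for `.bson` checkpoints in Flux projects); the key names stored inside are not documented here, so the snippet just lists them:

```julia
using Flux   # assumption: needed so that any Flux layer types stored in the file can be reconstructed
using BSON

# Load the bundled checkpoint and list the objects it contains
# (model weights, epoch counters, etc., depending on how the training script saved it).
ckpt = BSON.load("models/2a/best_modely_final.bson")
println(keys(ckpt))
```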