Replies: 5 comments
-
>>> othiele
-
>>> lissyx
-
>>> lissyx
-
>>> rajpuneet.sandhu
-
>>> othiele
-
>>> rajpuneet.sandhu
[January 12, 2021, 9:42pm]
Using Linux and deepspeech-gpu, I trained a model with 0.9.3 with the
following command:
python3 DeepSpeech.py \
--alphabet_config_path data/alphabet.txt \
--beam_width 32 \
--checkpoint_dir $ckpt_dir \
--export_dir $ckpt_dir \
--scorer $scorer_path \
--n_hidden 128 \
--learning_rate 0.0001 \
--lm_alpha 0.75 \
--lm_beta 1.85 \
--train_batch_size 6 \
--dev_batch_size 6 \
--test_batch_size 6 \
--report_count 10 \
--epochs 500 \
--noearly_stop \
--noshow_progressbar \
--export_tflite \
--train_files /datasets/deepspeech_wakeword_dataset/wakeword-train.csv,\
/datasets/deepspeech_wakeword_dataset/wakeword-train-other-accents.csv,\
/datasets/deepspeech_wakeword_dataset/wakeword-train.csv,\
/datasets/india_portal_2may2019-train.csv,\
/datasets/india_portal_2to9may2019-train.csv,\
/datasets/india_portal_9to19may2019-train.csv,\
/datasets/india_portal_19to24may2019-train.csv,\
/datasets/brazil_portal_20to26june2019-wakeword-train.csv,\
/datasets/brazil_portal_26juneto3july2019-wakeword-train.csv,\
/datasets/japan_portal_3july2019-wakeword-train.csv,\
/datasets/mixed_portal_backups_14_16_17_18_19_visteon_wakeword_dataset-train.csv,\
/datasets/alexa-train.csv,\
/datasets/alexa-polly-train.csv,\
/datasets/alexa-sns.csv,\
/datasets/india_portal_ww_data_04282020/custom_train.csv,\
/datasets/india_portal_ww_data_05042020/custom_train.csv,\
/datasets/india_portal_ww_data_05222020/custom_train.csv,\
/datasets/india_portal_ww_data_augmented_04282020/custom_train.csv,\
/datasets/india_portal_ww_data_augmented_04282020/custom_test.csv,\
/datasets/india_portal_ww_data_augmented_05042020/custom_train.csv,\
/datasets/india_portal_ww_data_augmented_05042020/custom_test.csv,\
/datasets/ww_gtts_data_google_siri/custom_train.csv,\
/datasets/ww_gtts_data_google_siri/custom_dev.csv,\
/datasets/ww_polly_data_google_siri/custom_train.csv,\
/datasets/ww_polly_data_google_siri/custom_test.csv \
--dev_files /datasets/deepspeech_wakeword_dataset/wakeword-dev.csv,\
/datasets/india_portal_2may2019-dev.csv,\
/datasets/india_portal_2to9may2019-dev.csv,\
/datasets/india_portal_9to19may2019-dev.csv,\
/datasets/india_portal_19to24may2019-dev.csv,\
/datasets/brazil_portal_20to26june2019-wakeword-dev.csv,\
/datasets/brazil_portal_26juneto3july2019-wakeword-dev.csv,\
/datasets/mixed_portal_backups_14_16_17_18_19_visteon_wakeword_dataset-dev.csv,\
/datasets/alexa-dev.csv,\
/datasets/india_portal_ww_data_augmented_04282020/custom_dev.csv,\
/datasets/india_portal_ww_data_augmented_05042020/custom_dev.csv,\
/datasets/india_portal_ww_data_05222020/custom_dev.csv,\
/datasets/ww_gtts_data_google_siri/custom_dev.csv,\
/datasets/ww_polly_data_google_siri/custom_dev.csv,\
/datasets/india_portal_ww_data_augmented_04282020/custom_dev.csv,\
/datasets/india_portal_ww_data_augmented_05042020/custom_dev.csv \
--test_files /datasets/alexa-sns.csv,\
/datasets/india_portal_ww_data_04282020/custom_dev.csv,\
/datasets/india_portal_ww_data_04282020/custom_test.csv,\
/datasets/india_portal_ww_data_05042020/custom_dev.csv,\
/datasets/india_portal_ww_data_05042020/custom_test.csv,\
/datasets/india_portal_ww_data_05222020/custom_dev.csv,\
/datasets/india_portal_ww_data_06182020/custom_dev.csv,\
/datasets/india_portal_ww_data_06182020/custom_test.csv
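In 0.9.3 the separate lm.binary and trie files of 0.6.x are replaced by a single .scorer package, so the file behind $scorer_path above has to be repackaged even if the text corpus is unchanged. A minimal sketch of that step, assuming an existing KenLM binary and a vocabulary text file (lm.binary, vocab.txt and wakeword.scorer are placeholder names, not from the original post), using the generate_scorer_package tool shipped with the 0.9.3 native client:

./generate_scorer_package --alphabet data/alphabet.txt \
  --lm lm.binary \
  --vocab vocab.txt \
  --package wakeword.scorer \
  --default_alpha 0.75 \
  --default_beta 1.85

The 0.75/1.85 baked in here simply mirror the --lm_alpha/--lm_beta flags passed above; 0.9.x ships with different default decoder weights.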
I had also previously trained a model with 0.6.1, using the same datasets for train, dev and test and keeping all the hyperparameters the same, with the following command:
python3 DeepSpeech.py \
--alphabet_config_path data/alphabet.txt \
--beam_width 32 \
--checkpoint_dir $ckpt_dir \
--export_dir $ckpt_dir \
--lm_binary_path $lm_path/lm.binary \
--lm_trie_path $lm_path/trie \
--n_hidden 128 \
--learning_rate 0.0001 \
--lm_alpha 0.75 \
--lm_beta 1.85 \
--train_batch_size 6 \
--dev_batch_size 6 \
--test_batch_size 4 \
--report_count 10 \
--epochs 500 \
--noearly_stop \
--noshow_progressbar \
--export_tflite \
--dev_files /datasets/deepspeech_wakeword_dataset/wakeword-dev.csv,\
/datasets/india_portal_2may2019-dev.csv,\
/datasets/india_portal_2to9may2019-dev.csv,\
/datasets/india_portal_9to19may2019-dev.csv,\
/datasets/india_portal_19to24may2019-dev.csv,\
/datasets/brazil_portal_20to26june2019-wakeword-dev.csv,\
/datasets/brazil_portal_26juneto3july2019-wakeword-dev.csv,\
/datasets/mixed_portal_backups_14_16_17_18_19_visteon_wakeword_dataset-dev.csv,\
/datasets/alexa-dev.csv,\
/datasets/india_portal_ww_data_augmented_04282020/custom_dev.csv,\
/datasets/india_portal_ww_data_augmented_05042020/custom_dev.csv,\
/datasets/india_portal_ww_data_05222020/custom_dev.csv,\
/datasets/ww_gtts_data_google_siri/custom_dev.csv,\
/datasets/ww_polly_data_google_siri/custom_dev.csv,\
/datasets/india_portal_ww_data_augmented_04282020/custom_dev.csv,\
/datasets/india_portal_ww_data_augmented_05042020/custom_dev.csv \
--test_files /datasets/alexa-sns.csv,\
/datasets/india_portal_ww_data_04282020/custom_dev.csv,\
/datasets/india_portal_ww_data_04282020/custom_test.csv,\
/datasets/india_portal_ww_data_05042020/custom_dev.csv,\
/datasets/india_portal_ww_data_05042020/custom_test.csv,\
/datasets/india_portal_ww_data_05222020/custom_dev.csv,\
/datasets/india_portal_ww_data_06182020/custom_dev.csv,\
/datasets/india_portal_ww_data_06182020/custom_test.csv
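Because the 0.6.1 run decoded against --lm_binary_path/--lm_trie_path while the 0.9.3 run decodes against the single --scorer package, the decoder inputs are not guaranteed to be equivalent even with the same text corpus. To re-check a single test set against the 0.9.3 checkpoint without re-running training, the training code also exposes an evaluation-only entry point; a minimal sketch, reusing the flag values and one of the test CSVs from the commands above:

python3 evaluate.py \
  --alphabet_config_path data/alphabet.txt \
  --checkpoint_dir $ckpt_dir \
  --scorer $scorer_path \
  --n_hidden 128 \
  --beam_width 32 \
  --lm_alpha 0.75 \
  --lm_beta 1.85 \
  --test_batch_size 6 \
  --test_files /datasets/india_portal_ww_data_06182020/custom_test.csv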
However, the average WER across all these test sets is 21.26% for 0.6.1 and 44.41% for 0.9.3. The text corpus used for the LM and the scorer was the same in both cases.
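One likely contributor: 0.75/1.85 were the 0.6.x defaults for --lm_alpha/--lm_beta, and the defaults shipped with 0.9.x are different, so carrying the old values over to a newly packaged scorer can itself hurt accuracy. The 0.9.3 training code includes lm_optimizer.py for re-tuning them on held-out data; a minimal sketch, assuming the same checkpoint and scorer as above (the dev CSV and the --n_trials budget are illustrative choices):

python3 lm_optimizer.py \
  --alphabet_config_path data/alphabet.txt \
  --checkpoint_dir $ckpt_dir \
  --scorer $scorer_path \
  --n_hidden 128 \
  --test_files /datasets/deepspeech_wakeword_dataset/wakeword-dev.csv \
  --n_trials 100

The best lm_alpha/lm_beta it reports can then be passed back to DeepSpeech.py, or baked into the scorer package as its defaults.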
[This is an archived DeepSpeech (STT) discussion thread from discourse.mozilla.org/t/performance-with-version-0-9-3-is-a-lot-worse-than-version-0-6-1]