>>> sanjay.pandey
[April 25, 2019, 5:50am]
Can you point me to any source, paper, or link that explains Mozilla
DeepSpeech in full? I have already gone through your "WER < 10" blog post, but I
want more detail on how the acoustic model and the language model work.
I want to build speech recognition for the restaurant domain. The model
mainly needs to understand every menu item, whether Indian, continental,
or any other dish, and it also needs to understand phone numbers. Most of
our customers will speak with an Indian accent.
Things I have done so far:
1. Continued training DeepSpeech 0.4.1 on the Mozilla Common Voice English
train-valid dataset. The final loss after 35 epochs of training was 0.8.
When I ran inference with a language model built only from the
vocabulary of the train-valid dataset, and spoke words that were in
that language model, the results were excellent, even in noise. So I
tried building a custom language model that included different food
items and the numbers 'zero' to 'nine', one per line.
When I ran inference with that model the results were not good. For
example, when I spoke 'three cheers chocolate', which is in the
language model, it was recognized as 'three cold'. The 'cold' comes
from 'cold coffee', which is also in the language model. I also
increased lm_alpha, lm_beta, and the beam width, but nothing changed.
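For reference, here is a minimal sketch of how a corpus file like the one described above (food items plus 'zero' to 'nine', one phrase per line) could be put together. The menu list, the function names, and the choice to keep only lowercase letters, apostrophes, and spaces are assumptions for illustration; they are not taken from the DeepSpeech tooling, and the alphabet used for training should be checked before relying on this.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

# Hypothetical sample menu; the real list would come from the restaurant.
MENU_ITEMS = ["Three Cheers Chocolate", "Cold Coffee"]


def normalize(phrase):
    """Lowercase a phrase and keep only a-z, apostrophes and spaces.

    Digits are spelled out one by one, so a phone number such as
    "98" becomes "nine eight".
    """
    phrase = phrase.lower()
    phrase = re.sub(r"[0-9]",
                    lambda m: " " + DIGIT_WORDS[int(m.group())] + " ",
                    phrase)
    phrase = re.sub(r"[^a-z' ]+", " ", phrase)
    # Collapse repeated whitespace left over from the substitutions.
    return " ".join(phrase.split())


def build_corpus(menu_items):
    """Return normalized phrases plus digit words, one entry per line."""
    lines = [normalize(item) for item in menu_items] + DIGIT_WORDS
    seen = set()
    corpus = []
    for line in lines:
        # Skip empty results and duplicates, keeping the first occurrence.
        if line and line not in seen:
            seen.add(line)
            corpus.append(line)
    return corpus


if __name__ == "__main__":
    # Print the corpus; redirect to a file to feed the language model build.
    print("\n".join(build_corpus(MENU_ITEMS)))
```

With every menu item on its own line, multi-word items such as 'three cheers chocolate' appear in the corpus as whole phrases, so the language model sees them as word sequences rather than as isolated words.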
So I am thinking of training on around 300 hours of Indian-accented
speech containing the words above, and then including the same words in
the language model.
Will that improve inference? Is there any other way? Or do I need to go
into more depth to understand it better?
If there is another way to improve inference, please tell me.
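One way to check whether a change actually improves inference is to compute the word error rate (WER) on a fixed set of test recordings before and after the change. The sketch below is a plain word-level edit-distance implementation; the function name is hypothetical and it is not the scoring code used by DeepSpeech itself.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # The misrecognition from the post: one substitution and one deletion
    # against a three-word reference.
    print(word_error_rate("three cheers chocolate", "three cold"))
```

Averaging this over the same held-out recordings for each experiment (different lm_alpha, lm_beta, or beam width, or a model trained on the extra accent data) gives a single number to compare.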
[This is an archived TTS discussion thread from discourse.mozilla.org/t/deepspeech-full-explaination]