Replies: 1 comment
-
>>> othiele |
-
>>> MarcS
[January 8, 2021, 9:31pm]
Environment: Ubuntu 18.04 on x64 platform
Release: DeepSpeech v0.9.3
Python Version: 3.6.x
Language: US English
Mic: eMeet
I was able to install the entire 0.9.3 release correctly, including all
of the tools/scripts, nvidia-tensorflow-gpu, KenLM, and the pre-built
model and scorer. I can successfully train on the Mozilla Common Voice
6.1 corpus (I just wanted to test that the install and configuration
were correct), and I can go through the process to rebuild/duplicate
the included LibriSpeech-based scorer. No errors or exceptions are
thrown during either of these processes.
Out of the box I have an accuracy rate approaching 100% for standard
word-based sentence constructs (e.g. 'The quick brown fox ...', 'Now is
the time for ...', etc.), and I can also speak a sequence of standard
digits (e.g. one, two, three, four, five ...) in quick succession with
a near-perfect recognition rate. But the accuracy for 'real world'
numbers (e.g. one hundred thousand) and dates (e.g. January seventh two
thousand and twenty one) is far lower. I would like to improve the
recognition accuracy of real-world numbers, dollar values, and dates.
My first question is: has someone already developed an English language
model/scorer with improved accuracy in these areas, under a
non-restrictive (Apache, Creative Commons, MIT, etc.) license? If not,
what steps would I need to go through to improve the accuracy of the
existing model/scorer in these areas?
Prior to posting this note, based upon 'Newb Theory', I tried to create
a custom scorer from a modified version of the 'librispeech-lm-norm.txt'
file (with additional sentences for numbers and dates), which I
downloaded from the OpenSLR.org website. I followed the exact
instructions provided on deepspeech.readthedocs.io for re-creating the
pre-built scorer, but there was no improvement in recognition accuracy.
I also tried the same process using a brand new text file containing
just a few sentences, but recognition accuracy did not change either.
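To illustrate the kind of number and date sentences that could be appended to the language model text, here is a minimal Python sketch. It is not the script actually used for the experiment above. The function names (number_to_words, ordinal_words, date_to_words, dollars_to_words) are hypothetical, and it assumes the corpus wants lowercase spoken forms with no digits, no hyphens, and no "and" between number groups.

```python
import datetime

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Largest scale first, so the first match is the leading group.
SCALES = [(1_000_000_000, "billion"), (1_000_000, "million"), (1_000, "thousand")]
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]
# Ordinals that are not formed by simply appending "th".
IRREGULAR = {"one": "first", "two": "second", "three": "third",
             "five": "fifth", "eight": "eighth", "nine": "ninth",
             "twelve": "twelfth"}


def number_to_words(n: int) -> str:
    """Spell a non-negative integer as lowercase words (assumed corpus style)."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        words = ONES[hundreds] + " hundred"
        return words if rest == 0 else words + " " + number_to_words(rest)
    for value, name in SCALES:
        if n >= value:
            head, rest = divmod(n, value)
            words = number_to_words(head) + " " + name
            return words if rest == 0 else words + " " + number_to_words(rest)


def ordinal_words(n: int) -> str:
    """Turn the last word of the cardinal form into an ordinal."""
    words = number_to_words(n).split()
    last = words[-1]
    if last in IRREGULAR:
        words[-1] = IRREGULAR[last]
    elif last.endswith("y"):
        words[-1] = last[:-1] + "ieth"   # twenty -> twentieth
    else:
        words[-1] = last + "th"          # seven -> seventh
    return " ".join(words)


def date_to_words(d: datetime.date) -> str:
    """Month, ordinal day, then the year read as a plain number."""
    return f"{MONTHS[d.month - 1]} {ordinal_words(d.day)} {number_to_words(d.year)}"


def dollars_to_words(amount: int) -> str:
    """Whole-dollar amounts only; cents are out of scope for this sketch."""
    unit = "dollar" if amount == 1 else "dollars"
    return f"{number_to_words(amount)} {unit}"


if __name__ == "__main__":
    print(number_to_words(100000))
    print(date_to_words(datetime.date(2021, 1, 7)))
    print(dollars_to_words(2500))
```

Each printed line could be embedded in a carrier sentence and written to the text file passed to the scorer build. Whether the added lines survive the vocabulary and pruning settings of that build is a separate question this sketch does not address.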
Any help or guidance would be greatly appreciated.
[This is an archived TTS discussion thread from discourse.mozilla.org/t/v0-9-3-improving-accuracy-of-numbers-dollar-values-and-calendar-dates]