V0 9 3 improving accuracy of numbers dollar values and calendar dates #1707

JRMeyer · 2021-03-08T08:35:51Z

JRMeyer
Mar 8, 2021
Maintainer

>>> MarcS
[January 8, 2021, 9:31pm]

Environment: Ubuntu 18.04 on x64 platform
Release: DeepSpeech v0.9.3
Python Version: 3.6.x
Language: US English
Mic: eMeet

I was able to install the entire 0.9.3 release correctly, including all
of the tools/scripts, nvidia-tensorflow-gpu, KenLM, and the pre-built
model and scorer. I can successfully train on the Mozilla Common Voice
6.1 corpus (I just wanted to test that the install and configuration
were correct), and I can go through the process to rebuild/duplicate
the included LibriSpeech-based scorer. No errors or exceptions are
thrown during either of these processes.

Out of the box I have an accuracy rate approaching 100% for standard
word-based sentence constructs (e.g., 'The quick brown fox ...', 'Now is
the time for ...', etc.), and I can also speak a sequence of standard
digits (e.g., one, two, three, four, five ...) in quick succession with
a near-perfect recognition rate. But the accuracy for 'real world'
numbers (e.g., one hundred thousand) and dates (e.g., January seventh
two thousand and twenty one) is far lower. I would like to improve the
recognition accuracy of real-world numbers, dollar values, and dates.

My first question is, has someone already developed an English language
model/scorer with improved accuracy in these key areas and released it
under a non-restrictive (Apache, Creative Commons, MIT, etc.) license?
If not, what steps would I need to go through to improve the accuracy
of the existing model/scorer in these areas?

Prior to posting this note, based upon 'Newb Theory', I did try to create
a custom scorer from a modified version of the 'librispeech-lm-norm.txt'
file (with additional sentences for numbers and dates added), which I
downloaded from the OpenSLR.org website. I followed the exact
instructions provided on deepspeech.readthedocs.io for re-creating the
pre-built scorer, but there was no improvement in recognition accuracy.
I also tried the same process using a brand-new text file containing
just a few sentences, but recognition accuracy did not change either.

Any help or guidance would be greatly appreciated.

[This is an archived TTS discussion thread from discourse.mozilla.org/t/v0-9-3-improving-accuracy-of-numbers-dollar-values-and-calendar-dates]

JRMeyer · 2021-03-08T08:35:54Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> othiele
[January 8, 2021, 10:31pm]

First, run your audio without a scorer argument to see what the acoustic
model detects in your test numbers.
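
A minimal sketch of that check, assuming the `deepspeech` 0.9.x Python package and a 16 kHz, 16-bit mono WAV recording. `load_audio` and `transcribe_both_ways` are hypothetical helper names, not part of DeepSpeech, and the file paths in the usage comment are placeholders. The helper expects a freshly loaded model that has no scorer attached yet.

```python
import wave


def load_audio(wav_path):
    """Read a 16-bit mono WAV file into the int16 array that Model.stt expects."""
    import numpy as np  # imported lazily; only needed when reading real audio

    with wave.open(wav_path, "rb") as wav:
        return np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)


def transcribe_both_ways(model, audio, scorer_path):
    """Return (acoustic-only transcript, transcript with the external scorer).

    Assumes `model` was just loaded, so no scorer is enabled for the first pass.
    """
    acoustic_only = model.stt(audio)          # first pass: acoustic model alone
    model.enableExternalScorer(scorer_path)   # attach the language model
    with_scorer = model.stt(audio)            # second pass: same audio, with scorer
    return acoustic_only, with_scorer


# Usage (placeholder paths, requires the deepspeech package):
#   from deepspeech import Model
#   model = Model("deepspeech-0.9.3-models.pbmm")
#   raw, scored = transcribe_both_ways(
#       model, load_audio("one_hundred_thousand.wav"), "custom.scorer")
#   print("no scorer:  ", raw)
#   print("with scorer:", scored)
```

If the acoustic-only output is already close to the spoken number, the scorer text is the place to work on. If it is far off, more number-heavy text in the scorer is unlikely to fix it on its own.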

Then change the text behind the custom scorer to contain a lot of
combinations of the numbers you want to detect. Simplified, the language
model gives the probability of this exact combination of spoken numbers
relative to all the text it was built from. So, ideally, that text has
some occurrences of this exact combination, not just the numbers in
general.
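
One way to get many such combinations is to generate them rather than type them. The sketch below is an illustration only: every function name is hypothetical, the random sampling is an arbitrary choice, and it assumes the scorer text should be lowercase words with no digits or hyphens, to match the English model's alphabet. It spells out numbers below one million, dollar amounts, and dates in the years 2001 to 2099.

```python
import random

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()
MONTHS = ("january february march april may june july august september "
          "october november december").split()
IRREGULAR = {"one": "first", "two": "second", "three": "third", "five": "fifth",
             "eight": "eighth", "nine": "ninth", "twelve": "twelfth"}


def number_to_words(n):
    """Spell out 0 <= n < 1,000,000 as lowercase words without hyphens."""
    if not 0 <= n < 1_000_000:
        raise ValueError("n must be in [0, 1000000)")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + (" " + ONES[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = " " + number_to_words(rest) if rest else ""
    return number_to_words(thousands) + " thousand" + tail


def ordinal_words(n):
    """Turn the last cardinal word into an ordinal, e.g. 21 -> 'twenty first'."""
    *head, last = number_to_words(n).split()
    if last in IRREGULAR:
        last = IRREGULAR[last]
    elif last.endswith("y"):
        last = last[:-1] + "ieth"
    else:
        last += "th"
    return " ".join(head + [last])


def date_phrases(month, day, year):
    """Two spoken variants of a date; year must be in 2001..2099."""
    rest = number_to_words(year - 2000)
    prefix = f"{MONTHS[month - 1]} {ordinal_words(day)} two thousand"
    return [f"{prefix} {rest}", f"{prefix} and {rest}"]


def dollar_phrase(amount):
    unit = "dollar" if amount == 1 else "dollars"
    return f"{number_to_words(amount)} {unit}"


def build_corpus(count=1000, seed=0):
    """Return 4 * count lines: a number, a dollar value, and two date variants."""
    rng = random.Random(seed)
    lines = []
    for _ in range(count):
        n = rng.randrange(1, 1_000_000)
        lines.append(number_to_words(n))
        lines.append(dollar_phrase(n))
        lines.extend(date_phrases(rng.randint(1, 12), rng.randint(1, 28),
                                  rng.randint(2001, 2099)))
    return lines
```

The resulting lines could be written to a text file and appended to the text used to build the scorer. Wrapping them in the carrier phrases you expect users to say would bring the text closer to the exact combinations described above.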

[Archived Post]
