calamari/calamari_ocr/scripts/predict.py, line 114 (commit f0139d6):
This merely post-processes some of the command-line choices. It never actually instantiates a `CTCDecoderProcessor`, nor does it replace the default one in the postprocessor pipeline.

How is this supposed to have worked in the first place?
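For illustration only, here is a rough sketch of the kind of wiring described as missing. The class names mirror those mentioned above but are hypothetical stand-ins, not Calamari's actual API: build decoder parameters from the parsed CLI choices, instantiate the processor, and swap it into the post-processing pipeline.

```python
# Hypothetical sketch -- stand-in classes, NOT Calamari's real API.
from dataclasses import dataclass, field
from typing import List


@dataclass
class CTCDecoderParams:          # stand-in, not the real class
    dictionary: List[str] = field(default_factory=list)
    non_word_chars: str = "!?.,;:"
    word_separator: str = " "


class CTCDecoderProcessor:       # stand-in, not the real class
    def __init__(self, params: CTCDecoderParams):
        self.params = params


def apply_cli_choices(pipeline: list, args) -> list:
    """Replace the default decoder processor with one built from the CLI choices."""
    params = CTCDecoderParams(
        dictionary=list(getattr(args, "dictionary", []) or []),
        non_word_chars=getattr(args, "non_word_chars", "!?.,;:"),
        word_separator=getattr(args, "word_separator", " "),
    )
    # Drop any default decoder processor and prepend the configured one.
    pipeline = [p for p in pipeline if not isinstance(p, CTCDecoderProcessor)]
    return [CTCDecoderProcessor(params)] + pipeline
```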
Also, the parameterization of this postprocessor raises further questions. Assuming some testing has been done with the `dictionary` feature (word beam search):

- Why is `non_word_chars` not automatically configured to all the punctuation characters in the entire charset during training? (All the public models I have seen contain only the default characters, i.e. ASCII punctuation.) See the sketch after this list.
- Why is `word_separator` only whitespace by default? Shouldn't that also allow more cases, like the hyphen (especially in German)?
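To make the `non_word_chars` point concrete, here is a minimal sketch (plain Python, independent of Calamari's internals; the function name is mine) of deriving the set from a model's charset instead of relying on an ASCII-only default:

```python
import unicodedata


def punctuation_in_charset(charset):
    """Return every Unicode punctuation character (categories P*) found in the charset.

    Deriving non_word_chars like this during training would cover e.g. German
    quotation marks or dashes whenever they occur in the charset, rather than
    only the default ASCII punctuation.
    """
    return "".join(sorted(c for c in charset if unicodedata.category(c).startswith("P")))


# Example: a charset with German quotation marks and an en dash
charset = set("abcdefghijklmnopqrstuvwxyz .,;:-–„“")
print(punctuation_in_charset(charset))   # -> ,-.:;–“„ (sorted by code point)
```

The same kind of charset inspection could also inform whether characters like the hyphen should be treated as additional word separators for languages such as German.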