Make other language support more obvious #48

Open
artjomsR opened this issue Feb 22, 2023 · 2 comments
artjomsR commented Feb 22, 2023

This tool is correctly advertised as working for all languages, but out of the box it works only with Japanese, and it's not obvious how to use it for other languages. Making this clearer would make the tool more accessible to all language learners. Suggested changes:

  1. Add an option to the UI settings to select the OCR language

OR

  2. Add documentation that makes it more obvious how the user can do the same manually. Here's my attempt:
    In config.ini, find the code for your language at https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html and use it in place of jpn in these lines:
tesseract_language = jpn
ocr_space_language = jpn

    Then download XYZ.traineddata for your language (where XYZ is that language code) from https://github.com/tesseract-ocr/tessdata_best/ (or https://github.com/tesseract-ocr/tessdata) and put it in the game2text\resources\bin\win\tesseract\tessdata folder.
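
For anyone who would rather script the manual steps above, here is a minimal Python sketch. It is not part of game2text; the helper names `set_ocr_language` and `traineddata_path` are hypothetical. It rewrites the two keys with a plain text substitution, so it makes no assumption about which section of config.ini they live in. It also takes the two language codes separately, on the assumption that the OCR.space code for a language may differ from the Tesseract one (check both lists before relying on this).

```python
import re
from pathlib import Path


def set_ocr_language(config_text: str, tesseract_lang: str, ocr_space_lang: str) -> str:
    """Return the config text with the two OCR language keys rewritten.

    Only the value after '=' changes; every other line is left untouched.
    """
    replacements = {
        "tesseract_language": tesseract_lang,
        "ocr_space_language": ocr_space_lang,
    }
    for key, value in replacements.items():
        # Match "<key> = <anything>" on a single line, keeping the "key = " prefix.
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*).*$", re.MULTILINE
        )
        config_text = pattern.sub(lambda m, v=value: m.group(1) + v, config_text)
    return config_text


def traineddata_path(install_dir: str, lang: str) -> Path:
    """Where <lang>.traineddata should go, per the folder named in this issue."""
    return (
        Path(install_dir)
        / "resources" / "bin" / "win" / "tesseract" / "tessdata"
        / f"{lang}.traineddata"
    )
```

To use it, read config.ini with `Path.read_text()`, pass the text through `set_ocr_language`, write the result back with `Path.write_text()`, and then save the downloaded traineddata file at the location returned by `traineddata_path`.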

@mathewthe2 mathewthe2 self-assigned this Mar 13, 2023
@drewboardman

How does one actually use it for other languages?

@artjomsR
Author

@drewboardman You should be able to follow the instructions in my comment above (the part after "Here's my attempt:"). This worked for me with a non-Japanese language at the time I wrote that comment.
