A program that will read text from your screen, even from images. This program can read notes, subtitles, and more.

Santabot123/ozvuchator

What is this?

This is a program that reads text from your screen aloud. It can read notes, subtitles, and more, and it can translate the text before speaking it. Here are some examples of what it looks and sounds like:

Screenshot 2024-04-18 124413

Short demo

video.demonstration.mp4

Demonstration of how the manual mode works

Demonstration of how the auto mode works

System Requirements

  • OS: Windows 10/11 64-bit
  • Internet connection

Installation

There are 2 ways:

  1. exe file:
    • Choose a version (if you do not know which one to choose, read the Versions chapter below).
    • Download and unzip one of these archives: Pytesseract, EasyOCR, EasyOCR+Pytesseract.
    • To run the program, go to exe.win-amd64-3.10 and run ozvuchator.exe.
  2. jupyter notebook:
    • Install Anaconda.
    • Create a new virtual environment in Anaconda.
    • Download this repository and unzip it, or use git clone https://github.com/Santabot123/ozvuchator
    • Choose a version (if you do not know which one to choose, read the Versions chapter below).
    • If you want to use Pytesseract, download Tesseract 5.3.3 and install it in the Tesseract-OCR folder located in the folder where you downloaded this repository. A newer version may exist by the time you read this, but I don't know whether it will work as well, so 5.3.3 is recommended.
    • Open Jupyter Notebook, navigate to the location where you unzipped this repository, and open the ozvuchator.ipynb that corresponds to your version.
    • Press ⏩.
    • Wait (the first run will take a long time because all the necessary libraries have to be downloaded).
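If the Pytesseract version cannot find Tesseract, it may help to check that the installation really ended up in the Tesseract-OCR folder. The snippet below is only a rough sketch of such a check: the helper name `find_bundled_tesseract` is made up for illustration, and the file name `tesseract.exe` is an assumption about what the Windows installer places in that folder, not something taken from the notebook.

```python
from pathlib import Path
from typing import Optional


def find_bundled_tesseract(repo_dir: str) -> Optional[Path]:
    """Return the path to tesseract.exe inside <repo_dir>/Tesseract-OCR, or None.

    Assumption: the Tesseract installer was pointed at the Tesseract-OCR
    folder and placed an executable named tesseract.exe there.
    """
    candidate = Path(repo_dir) / "Tesseract-OCR" / "tesseract.exe"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_bundled_tesseract(".")
    if found is None:
        print("Tesseract-OCR/tesseract.exe not found; check the install folder.")
    else:
        # With pytesseract installed, the binary can be selected explicitly:
        #   import pytesseract
        #   pytesseract.pytesseract.tesseract_cmd = str(found)
        print(f"Found Tesseract at {found}")
```

Run it from the folder where you unzipped the repository, or pass that folder as `repo_dir`.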

Versions

There are three versions: Pytesseract, EasyOCR, and Pytesseract+EasyOCR. Here is some information about them so you can decide which one you need:

  • Pytesseract - uses Tesseract. Choose this version if the text you want to listen to contrasts well with the background or looks like a scanned document or screenshot. Make sure that the screen area you select does not include interface elements, window borders, or icons, as they can be incorrectly recognized as characters. This version also takes up the least disk space. Examples of text that Pytesseract handles well:
    Screenshot 2024-05-02 191519 Screenshot 2024-05-02 192450 Screenshot 2024-04-27 182902

  • EasyOCR - choose this version if your text doesn't contrast well with the background, is part of a photo, or is distorted by perspective. EasyOCR can also use an Nvidia GPU instead of the CPU. Examples of text that EasyOCR handles well: Screenshot 2024-05-02 192758 Screenshot 2024-05-02 193230

  • Pytesseract+EasyOCR is essentially the two previous versions combined into one, which gives you the flexibility to choose which method to use.
    Screenshot 2024-05-02 193531
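To make the difference between the versions more concrete, here is a small sketch of how the two OCR libraries are typically called from Python, plus a helper that mirrors the advice above. This is not the code from ozvuchator.ipynb. The function names are hypothetical, `"eng"` is Tesseract's own code for English, and both OCR functions need their third-party package installed (`pytesseract` also needs the Tesseract program itself).

```python
from typing import Sequence


def suggest_version(clean_text: bool, difficult_text: bool) -> str:
    """Pick a version name based on the kinds of text you expect to read.

    clean_text: high-contrast text such as scans or screenshots.
    difficult_text: low contrast, photos, or perspective distortion.
    """
    if clean_text and difficult_text:
        return "Pytesseract+EasyOCR"
    if difficult_text:
        return "EasyOCR"
    return "Pytesseract"


def read_with_tesseract(image_path: str, lang: str = "eng") -> str:
    """OCR an image file with Tesseract through pytesseract."""
    import pytesseract  # third-party, imported lazily

    return pytesseract.image_to_string(image_path, lang=lang).strip()


def read_with_easyocr(image_path: str, langs: Sequence[str] = ("en",),
                      use_gpu: bool = False) -> str:
    """OCR an image file with EasyOCR; use_gpu=True needs an Nvidia GPU."""
    import easyocr  # third-party, imported lazily

    reader = easyocr.Reader(list(langs), gpu=use_gpu)
    # detail=0 makes readtext return only the recognized strings.
    return " ".join(reader.readtext(image_path, detail=0))
```

For example, `suggest_version(clean_text=True, difficult_text=False)` returns `"Pytesseract"`, the version with the smallest disk footprint.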

Usage

There are 2 modes of use: manual and auto.

  • Manual mode - you press the activation button (F2 by default) and select an area, after which the text will be spoken once.
  • Auto mode - you select an area of the screen, and whenever new text appears in this area, it is spoken automatically.
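The idea behind auto mode can be sketched as a loop that re-reads the selected area and speaks only when the recognized text changes. The code below is a simplified illustration, not the program's actual implementation: `AutoReader`, `capture_text`, and `speak` are hypothetical names, and the screen capture, OCR, and speech steps are passed in as plain functions.

```python
import time
from typing import Callable, Optional


def normalize(text: str) -> str:
    """Collapse whitespace so small spacing differences are not 'new text'."""
    return " ".join(text.split())


class AutoReader:
    """Speak the text of a screen area only when it differs from the last one."""

    def __init__(self, capture_text: Callable[[], str],
                 speak: Callable[[str], None]) -> None:
        self.capture_text = capture_text  # assumed: grabs the area and runs OCR
        self.speak = speak                # assumed: sends text to text-to-speech
        self.last_spoken = ""

    def poll_once(self) -> bool:
        """Check the area once; return True if something new was spoken."""
        text = normalize(self.capture_text())
        if not text or text == self.last_spoken:
            return False
        self.last_spoken = text
        self.speak(text)
        return True

    def run(self, interval_s: float = 0.5,
            max_polls: Optional[int] = None) -> None:
        """Poll repeatedly; max_polls=None keeps going until interrupted."""
        polls = 0
        while max_polls is None or polls < max_polls:
            self.poll_once()
            polls += 1
            time.sleep(interval_s)
```

Manual mode corresponds to a single `poll_once()`-style read triggered by the activation button, without the comparison against the previously spoken text.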

If you have already clicked Run and then want to change the settings, you need to close Ozvuchator and open it again.

Note

  • To read what a particular parameter does, hover over it with the cursor.
  • The longer the sentence, the longer the delay before the sound is played.
  • If you are going to use this program while playing a game, you may need to switch the game to windowed/borderless mode.
  • Text recognition works well only with printed letters.

List of supported languages for the EasyOCR version:

  • afrikaans : 'af'
  • albanian : 'sq'
  • arabic : 'ar'
  • bengali : 'bn'
  • bosnian : 'bs'
  • bulgarian : 'bg'
  • croatian : 'hr'
  • czech : 'cs'
  • danish : 'da'
  • dutch : 'nl'
  • english : 'en'
  • estonian : 'et'
  • filipino : 'tl'
  • french : 'fr'
  • german : 'de'
  • hindi : 'hi'
  • hungarian : 'hu'
  • icelandic : 'is'
  • indonesian : 'id'
  • italian : 'it'
  • japanese : 'ja'
  • kannada : 'kn'
  • korean : 'ko'
  • latin : 'la'
  • latvian : 'lv'
  • malay : 'ms'
  • marathi : 'mr'
  • nepali : 'ne'
  • norwegian : 'no'
  • polish : 'pl'
  • portuguese : 'pt'
  • romanian : 'ro'
  • slovak : 'sk'
  • spanish : 'es'
  • swahili : 'sw'
  • swedish : 'sv'
  • tamil : 'ta'
  • telugu : 'te'
  • thai : 'th'
  • turkish : 'tr'
  • ukrainian : 'uk'
  • urdu : 'ur'
  • vietnamese : 'vi'

List of supported languages for the Pytesseract version:

  • Afrikaans : af
  • Albanian : sq
  • Arabic : ar
  • Bengali : bn
  • Bosnian : bs
  • Bulgarian : bg
  • Burmese : my
  • Catalan : ca
  • Croatian : hr
  • Czech : cs
  • Danish : da
  • Dutch : nl
  • English : en
  • Estonian : et
  • Finnish : fi
  • French : fr
  • German : de
  • Gujarati : gu
  • Hindi : hi
  • Hungarian : hu
  • Icelandic : is
  • Indonesian : id
  • Italian : it
  • Japanese : ja
  • Kannada : kn
  • Khmer : km
  • Korean : ko
  • Latin : la
  • Latvian : lv
  • Malay (macrolanguage) : ms
  • Malayalam : ml
  • Marathi : mr
  • Modern Greek (1453-) : el
  • Nepali (macrolanguage) : ne
  • Norwegian : no
  • Polish : pl
  • Portuguese : pt
  • Romanian : ro
  • Serbian : sr
  • Sinhala : si
  • Slovak : sk
  • Spanish : es
  • Sundanese : su
  • Swahili (macrolanguage) : sw
  • Swedish : sv
  • Tagalog : tl
  • Tamil : ta
  • Telugu : te
  • Thai : th
  • Turkish : tr
  • Ukrainian : uk
  • Urdu : ur
  • Vietnamese : vi
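The two lists above overlap heavily, so a quick way to see which version supports a given language is to compare the codes as sets. The sketch below copies the codes from the two lists; the helper name `supported_versions` is made up for illustration.

```python
from typing import List

# Codes copied from the EasyOCR list above.
EASYOCR_CODES = set(
    "af sq ar bn bs bg hr cs da nl en et tl fr de hi hu is id it ja kn ko "
    "la lv ms mr ne no pl pt ro sk es sw sv ta te th tr uk ur vi".split()
)

# Codes copied from the Pytesseract list above.
PYTESSERACT_CODES = set(
    "af sq ar bn bs bg my ca hr cs da nl en et fi fr de gu hi hu is id it "
    "ja kn km ko la lv ms ml mr el ne no pl pt ro sr si sk es su sw sv tl "
    "ta te th tr uk ur vi".split()
)


def supported_versions(code: str) -> List[str]:
    """Return the versions whose language list contains the given code."""
    versions = []
    if code in EASYOCR_CODES:
        versions.append("EasyOCR")
    if code in PYTESSERACT_CODES:
        versions.append("Pytesseract")
    return versions


if __name__ == "__main__":
    # Languages listed only for the Pytesseract version.
    print(sorted(PYTESSERACT_CODES - EASYOCR_CODES))
    print(supported_versions("fi"))
```

According to the lists, every EasyOCR code also appears in the Pytesseract list, while ten codes (for example `fi` for Finnish and `el` for Modern Greek) are listed for Pytesseract only.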
