Uses VideoSubFinder and Google Cloud Vision to extract hardsubs and OCR them to create an SRT file. Main purpose is for use with MPVacious for quick dictionary lookups and use with subs2srs. All code in the .bat and .py files was written by ChatGPT.

Furretar/Hardsub-Extract-OCR

Hardsub-Extract-OCR: Extract hardsubs and OCR with 99% accuracy

Uses VideoSubFinder and Google Cloud Vision to extract hardsubs and OCR them into an SRT file. The main purpose is use with MPVacious for quick dictionary lookups and with subs2srs. All code in the .bat and .py files was written by ChatGPT. I used PyInstaller to package the Python script as an .exe; the source code is in BatchProcessor.py.

Installation (Only for Windows)

  1. Download this repository and extract it anywhere.
  2. Make sure Python is installed (version 3.13.0 was used here).
  3. Install VideoSubFinder and extract Release_x64 to the main directory.
    1. It's important that the folder name matches Release_x64 exactly.
/Hardsub-Extract-OCR-main/
├── /BatchProcessor/
├── /PutVideosInHere/
└── /Release_x64/ <-- here
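
A quick way to confirm the layout before running anything is a small check script. This is only a sketch and is not part of the repository: the function name `missing_folders` is hypothetical, and the folder names are taken from the tree above.

```python
from pathlib import Path

# Folder names as listed in this README; matching is exact on purpose.
EXPECTED_FOLDERS = ("BatchProcessor", "PutVideosInHere", "Release_x64")


def missing_folders(main_dir):
    """Return the expected folder names that are not present in main_dir."""
    root = Path(main_dir)
    if not root.is_dir():
        return list(EXPECTED_FOLDERS)
    # Compare against the real directory entries so that a misnamed folder
    # (for example "release_x64") is reported even on a case-insensitive
    # file system such as the Windows default.
    present = {entry.name for entry in root.iterdir() if entry.is_dir()}
    return [name for name in EXPECTED_FOLDERS if name not in present]


if __name__ == "__main__":
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder layout looks correct.")
```

Run it from inside the main directory; an empty result means all three folders were found under their exact names.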

Usage

Extraction

  • Put the videos you want to extract hardsubs from in the /PutVideosInHere/ folder.
  • Run the ExtractHardsubs.bat.
    • You will need to check the RGBImages folder so you can tweak the crop settings. A closer crop reduces noise in the final result and speeds up both the extraction and OCR steps. You may also need to close VideoSubFinder using Task Manager (Ctrl+Shift+Esc) before running the .bat again.

This is where the RGBImages folder is:

/Hardsub-Extract-OCR-main/
├── /BatchProcessor/
├── /PutVideosInHere/
│   ├── /output/
│   │   └── /VideoTitle/
│   │       └── /RGBImages/
│   └── VideoTitle.mp4
└── /Release_x64/

Example of a -be value that is too high, which crops too much off the bottom (image: 0_00_16_099__0_00_20_686_0019205760896008612800720).

The crop values are set in ExtractHardsubs.bat. To edit the file, right-click it and select Edit.

Once your crop values are tweaked, you can let the program run. It will leave the final images in TXTImages. They should look like this (image: 0_00_16_099__0_00_20_686_1019205790896007012800720).
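
The image names carry the subtitle timing. Judging from the example name above, the pattern appears to be H_MM_SS_mmm__H_MM_SS_mmm followed by extra digits. That reading is an assumption inferred from the example, not a documented format, and the function name `parse_image_times` is hypothetical. A minimal sketch of reading the start and end times from such a name:

```python
import re

# Assumed pattern: H_MM_SS_mmm__H_MM_SS_mmm followed by extra digits,
# inferred from the example file name shown in this README.
_NAME_RE = re.compile(
    r"^(\d+)_(\d{2})_(\d{2})_(\d{3})__(\d+)_(\d{2})_(\d{2})_(\d{3})"
)


def parse_image_times(file_name):
    """Return (start_ms, end_ms) parsed from a TXTImages file name."""
    match = _NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
    start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
    end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
    return start, end


print(parse_image_times("0_00_16_099__0_00_20_686_1019205790896007012800720"))
# prints (16099, 20686)
```

Under that assumption, the example image covers 0:00:16.099 to 0:00:20.686.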

OCR

The OCR engine we will be using is Google Cloud Vision, the same one used in Google Lens. I say 99% accuracy, but I've never actually seen it make a mistake with high quality source images. It does pick up some noise, though. At the time of writing, Google offers a $300 free trial for 3 months, which is enough to OCR over 100 episodes of anime in my experience.

  • Make a Google Cloud account.
  • Create a project.
    • You can just enter random text or "Other" for the details.
  • Enable billing, enter payment information.
    • It will not charge you unless you go over the trial amount.
  • In APIs & Services in your project, go to Credentials.
  • Click + Create Credentials, then Service Account.
  • Enter the service account details.
  • Inside the Service Account, click Keys.
  • Click ADD KEY, Create new key, make sure JSON is selected, then click Create.
  • Drag this JSON file into the main directory.
  • In your project, go to APIs & Services.
  • Open Enabled APIs & Services.
  • Click + Enable APIs and Services.
  • Search for Cloud Vision.
  • Enable it.
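
To check that the key and the API are set up correctly, you can OCR a single image from Python. This is a sketch under assumptions, not necessarily how BatchProcessor.py does it: it needs the `google-cloud-vision` package (`pip install google-cloud-vision`), and the function names are hypothetical. The key check relies on service account key files containing `"type": "service_account"`.

```python
import json


def load_service_account_key(key_path):
    """Load the JSON key and check that it is a service-account key."""
    with open(key_path, encoding="utf-8") as handle:
        key = json.load(handle)
    if key.get("type") != "service_account":
        raise ValueError("not a service-account key file")
    return key


def ocr_image(image_path, key_path):
    """Return the text Cloud Vision detects in one image."""
    # Imported here so the key check above works without the package.
    from google.cloud import vision

    load_service_account_key(key_path)
    client = vision.ImageAnnotatorClient.from_service_account_file(key_path)
    with open(image_path, "rb") as handle:
        image = vision.Image(content=handle.read())
    response = client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text
```

Calling `ocr_image` with one file from TXTImages and your JSON key should return the subtitle text. Each call sends a billable request to Cloud Vision, which counts against the trial credit.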

Now you can use the Batch Processor.

  • Run 01 InstallRequirements.bat.
  • Run 02 BatchProcessor.exe.
  • Select your folder with images.
    • If you're processing multiple videos at once, this will be the output folder; also check the Process Multiple Folders checkbox.
    • If you're processing a single video, this will be TXTImages.
  • Select the output folder, where the SRT files will be generated.
  • Select your JSON file, which should be in the main directory (e.g. example-473422-7e6ba2cacb95.json).
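
For reference, turning timed OCR results into SRT text can be sketched as below. This illustrates the SRT layout only and is not taken from BatchProcessor.py; the function names and the (start_ms, end_ms, text) tuple shape are assumptions made for the example.

```python
def format_srt_time(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(cues):
    """Build SRT text from an iterable of (start_ms, end_ms, text)."""
    # Drop cues whose OCR text is empty, then order by start time.
    kept = sorted(cue for cue in cues if cue[2].strip())
    blocks = []
    for index, (start, end, text) in enumerate(kept, start=1):
        timing = f"{format_srt_time(start)} --> {format_srt_time(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    # Blocks are separated by one blank line.
    return "\n".join(blocks)


print(build_srt([(16099, 20686, "Hello")]))
```

Each block consists of a running index, a timing line, and the text.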

Once you have your subtitle files, you may want to use Subtitle Edit to merge lines with the same text. For example, when a scene changes while the same subtitle line stays on screen, the line gets counted as two lines, which can cause problems when using subs2srs.

  • To do this, open Subtitle Edit.
  • Click Tools.
  • Select Batch Convert.
  • Drag in your files.
  • Check Merge lines with same text.
  • Choose Overwrite files.
  • Click Convert.
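
The idea behind merging can be sketched in a few lines of Python. This is a simplified illustration, not Subtitle Edit's actual algorithm: the function name `merge_same_text` is hypothetical, and `max_gap_ms` is an assumed tolerance rather than a Subtitle Edit setting.

```python
def merge_same_text(cues, max_gap_ms=250):
    """Merge consecutive cues that carry identical text.

    cues: list of (start_ms, end_ms, text) sorted by start time.
    max_gap_ms: assumed tolerance between the end of one cue and the
    start of the next; not a Subtitle Edit setting.
    """
    merged = []
    for start, end, text in cues:
        if merged:
            prev_start, prev_end, prev_text = merged[-1]
            if text == prev_text and start - prev_end <= max_gap_ms:
                # Extend the previous cue instead of adding a duplicate.
                merged[-1] = (prev_start, max(prev_end, end), text)
                continue
        merged.append((start, end, text))
    return merged


cues = [(0, 1000, "a"), (1040, 2000, "a"), (2500, 3000, "b")]
print(merge_same_text(cues))
# prints [(0, 2000, 'a'), (2500, 3000, 'b')]
```

The two "a" cues are only 40 ms apart, so they collapse into one line spanning both.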

Links

Discord: furretar

Note

If you'd rather use a locally hosted OCR instead of Google Cloud Vision, consider RapidVideOCR Desktop. It uses PaddleOCR, which has about 95% accuracy in my experience. You can also convert the TXTImages from the first step directly into a SUP file without OCR using Images to PGS SUP.
