aws-samples/amazon-textract-hocr-output

Amazon Textract to hOCR

Convert your Amazon Textract results to hOCR output.

Usage Instructions

The code that transforms Amazon Textract text extraction results into hOCR output is located in code/hocrOuput.py.

To make the code work you will need to install the following packages via pip:

In the main function of code/hocrOuput.py, set input_document_url to the location of your document in Amazon S3.

Run the script. It generates an output HTML file.
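To give a rough idea of what a Textract-to-hOCR conversion involves, here is a minimal sketch. It is not the code in code/hocrOuput.py. It assumes a response shaped like Amazon Textract's text detection output (a `Blocks` list with `BlockType`, `Text`, `Confidence`, `Geometry.BoundingBox`, and `Relationships`), and it assumes you already know the page size in pixels. The names `bbox_to_pixels`, `textract_to_hocr`, `page_width`, and `page_height` are hypothetical and chosen for illustration only.

```python
from html import escape


def bbox_to_pixels(box, page_width, page_height):
    """Convert a ratio-based Textract BoundingBox to pixel corners (x0, y0, x1, y1)."""
    x0 = round(box["Left"] * page_width)
    y0 = round(box["Top"] * page_height)
    x1 = round((box["Left"] + box["Width"]) * page_width)
    y1 = round((box["Top"] + box["Height"]) * page_height)
    return x0, y0, x1, y1


def _bbox_title(block, page_width, page_height):
    """Build the hOCR 'bbox x0 y0 x1 y1' property for one block."""
    box = block["Geometry"]["BoundingBox"]
    x0, y0, x1, y1 = bbox_to_pixels(box, page_width, page_height)
    return f"bbox {x0} {y0} {x1} {y1}"


def textract_to_hocr(response, page_width, page_height):
    """Render LINE and WORD blocks as a single-page hOCR document (sketch only)."""
    by_id = {block["Id"]: block for block in response["Blocks"]}
    lines = []
    for block in response["Blocks"]:
        if block["BlockType"] != "LINE":
            continue
        # A LINE block lists its WORD blocks through CHILD relationships.
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]
        words = []
        for child_id in child_ids:
            word = by_id.get(child_id)
            if word is None or word["BlockType"] != "WORD":
                continue
            title = _bbox_title(word, page_width, page_height)
            conf = round(word.get("Confidence", 0))
            text = escape(word["Text"])
            words.append(
                f'<span class="ocrx_word" title="{title}; x_wconf {conf}">{text}</span>'
            )
        line_title = _bbox_title(block, page_width, page_height)
        line_body = " ".join(words)
        lines.append(f'<span class="ocr_line" title="{line_title}">{line_body}</span>')

    page_title = f"bbox 0 0 {page_width} {page_height}"
    page = f'<div class="ocr_page" title="{page_title}">\n' + "\n".join(lines) + "\n</div>"
    return "<!DOCTYPE html>\n<html>\n<body>\n" + page + "\n</body>\n</html>\n"
```

Textract reports bounding boxes as ratios of the page dimensions, while hOCR expects pixel coordinates, so the sketch scales each box by the page size before writing the `bbox` property. Multi-page documents and non-text block types are left out to keep the example short.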

Output example

Example

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
