Mathematical OCR - Equation Recognition & Calculation

Step 1: Segmenting Image into Terms & Math Operators

Resize: Scale the image down to reduce computational overhead. I used a width of 1000 px, a height derived from the aspect ratio, and the INTER_AREA interpolation method, which preserves image quality when shrinking.

Threshold: Convert to grayscale, then apply OpenCV's threshold method so that pixels above a certain value become white (255) and the rest become black (0). Then invert the image so text = white and background = black. This prepares the image for dilation.

Dilation: Extend the white regions to merge nearby digits together, allowing the machine to detect which digits belong to the same term. This grouping is what gives the numbers meaning (a 1 next to a 2 should read as 12). Pixels are dilated with a kernel; to identify terms, a kernel with a large width is ideal in case digits are spaced out.

Contour Points: Use OpenCV's findContours() function to identify the boundaries of the terms and operators in the equation. The function operates on the dilated image, with RETR_EXTERNAL to omit any inner contours and CHAIN_APPROX_NONE to store every point of each contour for a more accurate result. OpenCV computes the contour points with Suzuki's border-following algorithm.

Bounding Rectangle: Use boundingRect() to find the smallest upright rectangle that encloses each set of contour points. These rectangles are used to split the image into smaller images of the individual terms and operators.

Repeat: Repeat the process on each sub-image to split terms into digits, using a smaller kernel since the digits within a term are close to each other, and store the results in a 3D array.

Step 2: Neural Network to Identify Digits & Symbols

Dataset: I used a dataset of 27,000 28x28 pixel images covering the digits 0-9, the symbols plus, minus, dot, multiply, and slash (for division), and the variables w, y, z. The dataset is from wblachowski, licensed under the MIT License. I use an 80:20 train:test split.
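One way to produce an 80:20 split is to shuffle the sample indices and slice them. The helper name and the fixed seed below are assumptions for the sake of a reproducible example; the notebook may split differently.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into train and test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

train_idx, test_idx = train_test_split_indices(27000)
print(len(train_idx), len(test_idx))  # 21600 5400
```

The index arrays can then be used to select matching rows from the image array and the label array, which keeps images and labels aligned.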

Format: Convert to grayscale. Then resize the image to a height of 28 pixels, scaling the width according to the aspect ratio (do not stretch the image by forcing both dimensions to 28 pixels!). Paste the image onto the center of a 28x28 black canvas so the input image is 28x28 pixels. Increase the intensity of pixels (reducing their grayscale value if they meet a certain threshold) to darken the handwriting.

Model: Consists of 3 layers; the inner layers use ReLU activation and the output layer uses softmax.

Image to String: Iterate through the array of images, use the trained model to predict the digit/symbol each one contains, append it to a string representing the equation, and use Python's built-in eval() function to evaluate the equation.
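As a rough illustration of the last two steps, the sketch below runs a forward pass through three layers in NumPy and turns predictions into an evaluated string. Everything specific here is an assumption: the layer sizes (784, 128, 64, 18), the label order in `CLASSES`, the random untrained weights, and the helper names. Builtins are removed from the eval() call as a small precaution.

```python
import numpy as np

# Hypothetical label order; the real order depends on how the dataset is encoded.
CLASSES = list("0123456789") + ["+", "-", ".", "*", "/", "w", "y", "z"]

def init_params(sizes, seed=0):
    """Random (untrained) weights and zero biases for each layer."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0, 0.01, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, params):
    """ReLU on the inner layers, softmax on the output layer."""
    a = x
    for weights, bias in params[:-1]:
        a = np.maximum(0.0, a @ weights + bias)
    weights, bias = params[-1]
    return softmax(a @ weights + bias)

def decode(probabilities):
    """Pick the most likely class per image and join them into a string."""
    return "".join(CLASSES[i] for i in probabilities.argmax(axis=1))

def evaluate(expression):
    # eval() as described in the text, with builtins removed.
    return eval(expression, {"__builtins__": {}}, {})

params = init_params([784, 128, 64, len(CLASSES)])
probs = forward(np.random.default_rng(1).random((4, 784)), params)
print(probs.shape)  # (4, 18)

# Stand-in for trained predictions: one-hot rows for "1", "2", "+", "3".
one_hot = np.eye(len(CLASSES))[[1, 2, 10, 3]]
expression = decode(one_hot)
print(expression, "=", evaluate(expression))  # 12+3 = 15
```

Expressions containing the variables w, y, or z would need values supplied before evaluation; this sketch does not handle that case.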

pearl-natalia/MathOCR
