Image processing and neural networks to analyze and calculate handwritten equations.

pearl-natalia/MathOCR

Mathematical OCR - Equation Recognition & Calculation

Step 1: Segmenting Image into Terms & Math Operators

Resize: Scale the image down to reduce computational overhead. I used a width of 1000 px, a height derived from the aspect ratio, and OpenCV's INTER_AREA interpolation, which preserves image quality when shrinking.
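
The size calculation for this step can be sketched in plain Python. The helper name `scaled_size` and the sample photo dimensions are illustrative assumptions, not taken from the repository.

```python
def scaled_size(width, height, target_width=1000):
    """Return (new_width, new_height) that keeps the original aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Round to the nearest pixel and never let the height collapse to zero.
    new_height = max(1, round(height * target_width / width))
    return target_width, new_height


# Hypothetical 4032 x 3024 phone photo.
new_size = scaled_size(4032, 3024)
print(new_size)  # (1000, 750)
```

The resulting tuple is in the (width, height) order that `cv2.resize(image, new_size, interpolation=cv2.INTER_AREA)` expects.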

Threshold: Convert the image to grayscale and apply OpenCV's threshold() function, so that pixels above a chosen value become white (255) and all others become black (0). Then invert the image so the text is white and the background is black. This prepares the image for dilation.
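
As a dependency-free illustration of what thresholding followed by inversion does to the pixel values, here is a minimal sketch on a tiny grayscale grid. The cutoff of 127 and the function name are assumptions; the README does not state the value used.

```python
def threshold_inverted(gray, thresh=127):
    """Binarize, then invert: dark ink becomes 255, light paper becomes 0."""
    return [[0 if px > thresh else 255 for px in row] for row in gray]


# Light paper (values near 250) with a short dark stroke in the middle column.
gray = [
    [250, 248, 251],
    [249,  30, 247],
    [252,  25, 250],
]
binary = threshold_inverted(gray)
for row in binary:
    print(row)
```

In OpenCV, both steps can be combined in one call with the `cv2.THRESH_BINARY_INV` flag of `cv2.threshold`.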

Dilation: Extend the white regions so that nearby digits merge, which lets the program detect which digits belong to the same term. This is what gives the numbers meaning: a "1" and a "2" written side by side should be read as 12. Pixels are dilated with a kernel (see below); to identify terms, a kernel with a large width is ideal in case the digits are spaced out.

[Image: dilation kernel]
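
To show why a wide kernel merges spaced-out digits, here is a small pure-Python sketch of dilation with a rectangular kernel. It mimics the effect of `cv2.dilate` with a `cv2.MORPH_RECT` kernel; the kernel size of 5 x 1 and the sample row are assumptions chosen for illustration, and odd kernel dimensions are assumed.

```python
def dilate(binary, kernel_w, kernel_h):
    """Dilate a 0/255 grid with a kernel_w x kernel_h rectangle of ones."""
    rows, cols = len(binary), len(binary[0])
    rx, ry = kernel_w // 2, kernel_h // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] == 0:
                continue
            # Every white pixel paints the kernel-sized window around it.
            for yy in range(max(0, y - ry), min(rows, y + ry + 1)):
                for xx in range(max(0, x - rx), min(cols, x + rx + 1)):
                    out[yy][xx] = 255
    return out


# Two strokes close together (x=1, x=4) and one far away (x=10).
row = [0, 255, 0, 0, 255, 0, 0, 0, 0, 0, 255, 0, 0]
dilated = dilate([row], kernel_w=5, kernel_h=1)
print(dilated[0])
```

After dilation the two nearby strokes form one white run (columns 0 to 6), while the distant stroke stays separate, so they would be detected as two regions.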

Contour Points: Use OpenCV's findContours() function to trace the outlines of the terms and operators in the equation. The function operates on the dilated image, using RETR_EXTERNAL to omit any inner contours and CHAIN_APPROX_NONE to store every point of each contour for a more accurate result. OpenCV uses Suzuki's algorithm to calculate the contour points.
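
The idea of one outline per white region can be illustrated without OpenCV. The sketch below is not Suzuki's border-following algorithm: it swaps in a breadth-first flood fill to group pixels into 8-connected blobs and then keeps the pixels that touch the background. Unlike RETR_EXTERNAL, it would also keep the edges of holes, and the points are unordered. The function name and sample grid are assumptions.

```python
from collections import deque


def blob_outlines(binary):
    """Return one set of boundary pixels (x, y) per 8-connected white blob."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    outlines = []

    def is_white(x, y):
        return 0 <= x < cols and 0 <= y < rows and binary[y][x] != 0

    for y0 in range(rows):
        for x0 in range(cols):
            if binary[y0][x0] == 0 or seen[y0][x0]:
                continue
            # Breadth-first flood fill collects every pixel of this blob.
            queue, blob = deque([(x0, y0)]), []
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                blob.append((x, y))
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if is_white(nx, ny) and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            # Keep only pixels with a background (or out-of-image) 4-neighbour.
            edge = {
                (x, y) for x, y in blob
                if not all(is_white(x + dx, y + dy)
                           for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
            }
            outlines.append(edge)
    return outlines


W = 255
grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, W, W, W, 0, W, 0],
    [0, W, W, W, 0, W, 0],
    [0, W, W, W, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
outlines = blob_outlines(grid)
print(len(outlines), [len(o) for o in outlines])
```

With OpenCV 4, the equivalent call is `contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)`, which returns ordered boundary points instead.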

Bounding Rectangle: Use boundingRect() to find the smallest upright rectangle that encloses each set of contour points. These rectangles are used to split the image into smaller images of the individual terms and operators.
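
A minimal sketch of the bounding-rectangle and cropping step, using the same (x, y, w, h) layout that `cv2.boundingRect` returns. Sorting the rectangles left to right is an assumption added here so the pieces come out in reading order; the helper names and sample points are hypothetical.

```python
def bounding_rect(points):
    """Smallest upright rectangle (x, y, w, h) enclosing the given (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x, y = min(xs), min(ys)
    return x, y, max(xs) - x + 1, max(ys) - y + 1


def crop(image, rect):
    """Cut the rectangle out of a row-major image (list of rows)."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]


# Two hypothetical contours, deliberately listed right to left.
contours = [
    {(5, 1), (5, 2)},
    {(1, 1), (3, 1), (1, 3), (3, 3)},
]
rects = sorted(bounding_rect(c) for c in contours)  # sorts by x first
print(rects)  # [(1, 1, 3, 3), (5, 1, 1, 2)]

image = [[0] * 7 for _ in range(5)]
image[1][5] = image[2][5] = 255
piece = crop(image, rects[1])
print(piece)  # [[255], [255]]
```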

Repeat: Repeat the process on each sub-image to split the terms into individual digits, this time with a smaller kernel since the digits within a term are close to each other, and store the results in a 3D array.
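
The two-pass idea (coarse split into terms, then fine split into digits) can be shown in one dimension. This sketch swaps the dilation and contour steps for a simpler stand-in: it looks only at which columns contain ink and merges columns separated by small gaps, with a large tolerance for terms and a small one for digits. The gap values and function names are assumptions.

```python
def split_runs(inked, max_gap, offset=0):
    """Group inked column indices; gaps of at most max_gap columns stay merged."""
    spans = []
    for x, on in enumerate(inked):
        if not on:
            continue
        if spans and x - spans[-1][1] - 1 <= max_gap:
            spans[-1][1] = x          # extend the current span
        else:
            spans.append([x, x])      # start a new span
    return [(a + offset, b + offset) for a, b in spans]


def segment(inked, term_gap=3, digit_gap=0):
    """First pass finds terms, second pass splits each term into digits."""
    nested = []
    for start, end in split_runs(inked, term_gap):
        nested.append(split_runs(inked[start:end + 1], digit_gap, offset=start))
    return nested


# '#' marks a column that contains ink: two digits, an operator, two digits.
profile = [c == "#" for c in "##.##....#....##.#"]
nested = segment(profile)
print(nested)
# [[(0, 1), (3, 4)], [(9, 9)], [(14, 15), (17, 17)]]
```

The nesting (terms, then symbols within each term) mirrors how the cropped images could be organized, with each innermost entry replaced by the pixel data of one symbol.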

Step 2: Neural Network to Identify Digits & Symbols

Dataset: I used a dataset of 27,000 images, each 28x28 pixels, covering the digits 0-9, the symbols plus, minus, dot, multiply, and slash (for division), and the variables w, y, and z. The dataset is from wblachowski and is licensed under the MIT License. The data is split 80:20 into training and test sets.
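
A minimal sketch of an 80:20 split using only the standard library. Shuffling before splitting and the fixed seed are assumptions; the README does not say how the split was performed, and the integer range stands in for the real samples.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and return (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 27,000 labelled images.
train, test = split_dataset(range(27000))
print(len(train), len(test))  # 21600 5400
```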

Format: Convert the image to grayscale. Resize it to a height of 28 pixels and scale the width by the same factor to keep the aspect ratio (do not stretch the image by forcing both dimensions to 28 pixels!). Then paste the image onto the center of a 28x28 black canvas so that the input image is 28x28 pixels. Increase the intensity of the pixels (reducing their grayscale value if they meet a certain threshold) to darken the handwriting.
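
The resize-and-center step can be sketched in plain Python as follows. Two assumptions are made beyond the text: nearest-neighbour resampling is used for brevity, and the scale is taken from the larger dimension so that wide symbols (such as a minus sign) still fit on the canvas. The function name is hypothetical.

```python
def fit_to_canvas(image, size=28):
    """Scale a row-major image to fit size x size and center it on black."""
    h, w = len(image), len(image[0])
    scale = size / max(h, w)  # assumption: limit by the larger dimension
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Nearest-neighbour resampling keeps the sketch short.
    resized = [
        [image[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]
    canvas = [[0] * size for _ in range(size)]
    top, left = (size - new_h) // 2, (size - new_w) // 2
    for y, row in enumerate(resized):
        canvas[top + y][left:left + new_w] = row
    return canvas


# A tall, narrow all-white symbol: 56 rows by 20 columns.
symbol = [[255] * 20 for _ in range(56)]
canvas = fit_to_canvas(symbol)
print(len(canvas), len(canvas[0]))  # 28 28
```

For this input the symbol is scaled to 28 x 10 and occupies columns 9 to 18 of the canvas, with black padding on both sides instead of horizontal stretching.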

Model: The model consists of 3 layers. The inner layers use ReLU activation and the output layer uses softmax.

Image to String: Iterate through the array of images, use the trained model to predict the digit or symbol each one contains, append each prediction to a string representing the equation, and use Python's built-in eval() function to evaluate the equation.
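
The image-to-string step can be sketched as below. The label order in `LABELS` is hypothetical (it depends on how the model was trained), the softmax outputs are faked with one-hot vectors, and the sketch swaps `eval()` for a small `ast`-based evaluator that only accepts numbers and the four arithmetic operators. That swap is an assumption made here for illustration, not the method used in the repository, and it does not handle the variables w, y, and z.

```python
import ast
import operator

# Hypothetical class order; the real order depends on the training setup.
LABELS = list("0123456789") + ["+", "-", ".", "*", "/", "w", "y", "z"]

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def to_equation(probabilities):
    """Pick the most likely label for each symbol and join them into a string."""
    return "".join(
        LABELS[max(range(len(p)), key=p.__getitem__)] for p in probabilities
    )


def evaluate(expression):
    """Evaluate an arithmetic string without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)


def one_hot(label):
    """Stand-in for a softmax output that is certain about one class."""
    return [1.0 if candidate == label else 0.0 for candidate in LABELS]


outputs = [one_hot(c) for c in "12+3*4"]
equation = to_equation(outputs)
result = evaluate(equation)
print(equation, "=", result)  # 12+3*4 = 24
```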
