MNIST Neural Network From Scratch

Why I Created This Notebook

I created this notebook to learn the basics of neural networks, driven by my interest in AI. The goal is to build a neural network from scratch that can classify handwritten digits.

Data

The dataset used for training and evaluation can be found here. It contains a small subset of the MNIST dataset, converted to CSV format.
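As a rough illustration of reading such a CSV, here is a minimal sketch using only the standard library. The column layout is an assumption (label in the first column, followed by 784 pixel values in 0..255), as are the helper names `parse_row` and `load_rows`. The one-row input is synthetic and only stands in for the real file.

```python
import csv
import io


def parse_row(row):
    """Split one CSV row into (label, pixels scaled to [0, 1]).

    Assumes a label-first layout with 784 pixel values in 0..255.
    """
    label = int(row[0])
    pixels = [int(v) / 255.0 for v in row[1:]]
    if len(pixels) != 784:
        raise ValueError(f"expected 784 pixels, got {len(pixels)}")
    return label, pixels


def load_rows(handle, has_header=False):
    """Read every non-empty row from an open CSV file handle."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header line if the file has one
    return [parse_row(row) for row in reader if row]


# Synthetic one-row file standing in for the real CSV (not actual MNIST data).
fake_csv = io.StringIO("7," + ",".join(["0"] * 783 + ["255"]) + "\n")
samples = load_rows(fake_csv)
label, pixels = samples[0]
print(label, len(pixels), pixels[-1])
```

Scaling pixels to [0, 1] is a common choice for keeping activations in a reasonable range. Whether the notebook does the same is not stated here.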

Result

The v4 notebook reached an accuracy of 80% by using LeakyReLU instead of ReLU [1] [2], which fixes the dying ReLU problem.
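To make the dying ReLU problem concrete, the sketch below compares the two activations and their gradients on scalar inputs. The negative slope `alpha=0.01` is an assumed value, not necessarily the one used in the notebook.

```python
def relu(x):
    """Standard ReLU: zero for negative inputs."""
    return max(0.0, x)


def relu_grad(x):
    """ReLU gradient: exactly 0 for negative inputs, so such a unit gets no update."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, alpha=0.01):
    """LeakyReLU: keeps a small slope `alpha` for negative inputs (alpha is assumed)."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """LeakyReLU gradient: never zero, so a negative unit can still recover."""
    return 1.0 if x > 0 else alpha


for x in (-2.0, 0.5):
    print(x, relu(x), relu_grad(x), leaky_relu(x), leaky_relu_grad(x))
```

A neuron whose pre-activation is negative for every training sample has a ReLU gradient of 0 everywhere and stops learning. With LeakyReLU the gradient stays at `alpha`, so weight updates continue.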

| Network parameter | Value | Notes |
| --- | --- | --- |
| Input layer | 784 neurons | |
| Hidden layer | 200 neurons | |
| Output layer | 10 neurons | Multi-class classification (10 numbers) |
| Hidden layer activation function | LeakyReLU | |
| Final layer activation function | Softmax | Used for multi-class classification |
| Loss function | Cross-entropy | |
| Learning rate | 0.00001 | This learning rate gives the least funky plot |
| Epochs | 250 | |
| Early stopping point | 92 | Check graph and log |
| Performance | 0.8 | |
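The architecture in the table can be sketched as a single forward pass. This is a pure-Python illustration under stated assumptions, not the notebook's code: the weight initialization (He-style Gaussian), the LeakyReLU slope of 0.01, and all function names are hypothetical, and the input is random noise standing in for one scaled image.

```python
import math
import random

N_IN, N_HIDDEN, N_OUT = 784, 200, 10  # layer sizes from the table


def init_layer(n_in, n_out, rng):
    """Random Gaussian weights and zero biases (the init scheme is an assumption)."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(weights, biases, x):
    """Affine map: one weighted sum plus bias per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def leaky_relu(z, alpha=0.01):
    """Element-wise LeakyReLU with an assumed negative slope."""
    return [v if v > 0 else alpha * v for v in z]


def softmax(z):
    """Subtract the max before exponentiating for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label] + eps)


def forward(params, x):
    """784 -> 200 (LeakyReLU) -> 10 (softmax)."""
    (w1, b1), (w2, b2) = params
    hidden = leaky_relu(dense(w1, b1, x))
    return softmax(dense(w2, b2, hidden))


rng = random.Random(0)
params = [init_layer(N_IN, N_HIDDEN, rng), init_layer(N_HIDDEN, N_OUT, rng)]
x = [rng.random() for _ in range(N_IN)]  # stand-in for one scaled image
probs = forward(params, x)
loss = cross_entropy(probs, label=3)  # arbitrary label, for illustration only
print(len(probs), round(sum(probs), 6), loss)
```

Softmax turns the 10 output values into probabilities that sum to 1, and cross-entropy penalizes a low probability on the correct digit. A NumPy version would replace the list comprehensions with matrix products and run much faster.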

Observations

Lowering the learning rate by 10x, from 0.0001 to 0.00001, did not significantly increase the convergence point. This could be because of the use of LeakyReLU, which might offer faster convergence [3].
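One way to pin down a convergence point, such as the early stopping epoch in the table, is a patience rule on the validation loss. The README does not state how its early stopping point was determined, so the rule, the `patience` parameter, and the function name below are hypothetical. The loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the 1-based epoch with the best loss seen before stopping.

    Training is considered converged once `patience` consecutive epochs
    pass without a new best loss. If that never happens, the best epoch
    over the whole run is returned.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch


# Made-up loss curve, not taken from the notebook's log.
losses = [1.0, 0.8, 0.7, 0.65, 0.66, 0.67, 0.66, 0.68, 0.7, 0.5]
print(early_stopping_epoch(losses, patience=3))
```

Running the same rule on the logs of both learning rates would give two epoch numbers that can be compared directly, rather than reading the convergence point off the plot.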

Links

Kaggle