- Here we train a VAE to generate handwritten digits
- The dataset is MNIST. It is downloaded into the `dataset` folder using torchvision, and the resulting folder structure looks like this:
dataset
├── mnist
│ └── MNIST
│ │ └── raw
│ │ ├── t10k-images-idx3-ubyte
│ │ ├── t10k-images-idx3-ubyte.gz
│ │ ├── t10k-labels-idx1-ubyte
│ │ ├── t10k-labels-idx1-ubyte.gz
│ │ ├── train-images-idx3-ubyte
│ │ ├── train-images-idx3-ubyte.gz
│ │ ├── train-labels-idx1-ubyte
│ │ └── train-labels-idx1-ubyte.gz
- For this task, we build a simple model that contains only fully connected layers, with no convolutions
- I trained on an NVIDIA GeForce RTX 3090; each epoch takes about 3 seconds
- If you want to train from scratch, you don't have to modify anything. Once training is finished and you want to generate digit images, modify `mode`. Then simply run the program and wait for your generated digits:
python run.py
- Of course, you can modify the model architecture or try other hyperparameters; feel free to experiment
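For readers who want to experiment with the architecture, here is a minimal sketch of what a fully connected VAE for 28x28 MNIST images could look like. The class name `FullyConnectedVAE`, the layer sizes (`hidden_dim=400`, `latent_dim=20`), and the loss helper `vae_loss` are all assumptions for illustration and may differ from the model in this repository.

```python
import torch
from torch import nn


class FullyConnectedVAE(nn.Module):
    """Illustrative VAE built only from linear layers (sizes are assumed)."""

    def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
            nn.Sigmoid(),  # pixel intensities in [0, 1]
        )

    def encode(self, x):
        h = self.encoder(x.flatten(start_dim=1))
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, with eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar


def vae_loss(recon, x, mu, logvar):
    """Reconstruction term (binary cross-entropy) plus KL divergence to N(0, I)."""
    bce = nn.functional.binary_cross_entropy(
        recon, x.flatten(start_dim=1), reduction="sum"
    )
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld
```

Changing `hidden_dim` or `latent_dim` in the constructor is one easy way to try other hyperparameters without touching the rest of the code.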
- First, we sample random Gaussian noise and decode it into images; here are 256 examples
- Next, we can look at the reconstructed digits
- I think the quality is good considering how simple the model is. You can also try the model used in VAE_ANIME
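The sampling step described above (drawing Gaussian noise and decoding it into 256 images) can be sketched roughly as follows. The decoder here is an untrained stand-in with assumed sizes, and `sample_images` is a hypothetical helper; in practice you would pass in the trained decoder from your own checkpoint.

```python
import torch
from torch import nn

latent_dim = 20  # assumed latent size, not taken from the repository

# Untrained stand-in decoder; replace it with the trained one.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 400),
    nn.ReLU(),
    nn.Linear(400, 28 * 28),
    nn.Sigmoid(),
)


@torch.no_grad()
def sample_images(decoder, num_samples=256, latent_dim=20):
    """Draw z ~ N(0, I) and decode it into a batch of 1x28x28 images."""
    decoder.eval()
    z = torch.randn(num_samples, latent_dim)
    return decoder(z).view(num_samples, 1, 28, 28)


images = sample_images(decoder, num_samples=256, latent_dim=latent_dim)
```

To view the result as a 16x16 grid, one option is `torchvision.utils.save_image(images, "samples.png", nrow=16)`, where the file name is only an example.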