Keras implementation of Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules. NIPS 2017.
After training for 50 epochs with Adam (default parameters) and decaying the learning rate by a factor of 0.9 after every epoch, the best test set error was 0.34%.
The paper reported an average test set error of 0.25%. The number of epochs and the learning rate decay scheme were not specified in the paper.
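The decay schedule described above can be sketched as a plain function of the epoch index. This is a minimal illustration, not the repository's actual code: it assumes the decay is applied multiplicatively starting from 0.001 (the default Adam learning rate in Keras), and the names `make_schedule` and `schedule` are hypothetical.

```python
ADAM_DEFAULT_LR = 0.001  # default learning rate of Keras' Adam optimizer
DECAY_FACTOR = 0.9       # multiplicative decay applied after every epoch


def make_schedule(initial_lr=ADAM_DEFAULT_LR, decay=DECAY_FACTOR):
    """Build a schedule that returns initial_lr * decay ** epoch."""

    def schedule(epoch, lr=None):
        # `lr` (the optimizer's current rate) is accepted for compatibility
        # with callbacks that pass it, but the value is recomputed from the
        # zero-based epoch index so the result does not depend on it.
        return initial_lr * decay ** epoch

    return schedule


schedule = make_schedule()
rates = [schedule(epoch) for epoch in range(50)]  # one rate per training epoch
```

A function of this shape could be passed to `keras.callbacks.LearningRateScheduler` and supplied to `model.fit` through its `callbacks` argument. Under these assumptions, the learning rate in the final (50th) epoch is roughly 0.6% of its initial value.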
Reconstructions are of poor quality, possibly because the model was not trained for enough epochs.
Training on Google Colab (Tesla K80 GPU) took around 300s per epoch.
- Investigate the poor quality of the digit reconstructions and improve them
- Once reconstructions are improved, visualize how changing a capsule's dimensions affects the reconstruction
- Add docstrings and more comments explaining what's happening in the code
- Test on other datasets (CIFAR10, smallNORB, MultiMNIST)
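For the capsule-dimension visualization item in the list above, one possible starting point is to generate perturbed copies of a single digit capsule and feed each copy to the decoder. The sketch below only builds the perturbed vectors. The function name `perturb_capsule` is hypothetical, and the default range of [-0.25, 0.25] in steps of 0.05 follows the perturbation experiment described in the paper.

```python
def perturb_capsule(capsule, dim, low=-0.25, high=0.25, step=0.05):
    """Return copies of `capsule` with entry `dim` shifted across [low, high]."""
    n_steps = round((high - low) / step) + 1
    variants = []
    for i in range(n_steps):
        delta = low + i * step
        shifted = list(capsule)  # copy, so the input vector is left untouched
        shifted[dim] += delta
        variants.append(shifted)
    return variants


# Example: a placeholder 16-dimensional capsule output (all zeros here).
capsule = [0.0] * 16
variants = perturb_capsule(capsule, dim=3)
```

Each vector in `variants` would then be masked and passed through the reconstruction decoder, giving one row of images per perturbed dimension.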
Keras implementation by Xifeng Guo, especially the implementation of the masking layer.