Despite the success of deep learning in many application areas, neural networks lack predictive uncertainty estimates. Gaussian processes, as Bayesian non-parametric models, provide uncertainty quantification and a full mathematical interpretation. But scalability remains the biggest challenge for Gaussian processes: because of the inversion of the N x N kernel matrix, exact inference costs O(N^3) in the number of training points N.
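For reference, the standard GP regression posterior makes that cost explicit: the $(K + \sigma^2 I)^{-1}$ term below is the $N \times N$ matrix inversion in question,

$$
\bar{f}_* = k_*^\top (K + \sigma^2 I)^{-1} y,
\qquad
\mathbb{V}[f_*] = k(x_*, x_*) - k_*^\top (K + \sigma^2 I)^{-1} k_*,
$$

where $K$ is the kernel matrix over the $N$ training inputs and $k_*$ collects the kernel values between a test point $x_*$ and the training inputs.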
We studied non-local generalization in shallow structures such as kernel methods.
- Multiclass classification on the MNIST dataset, without any convolutional structure
- Sparse Gaussian processes (inducing points) to reduce the complexity (see the sketch after this list)
- Variational inference (minimizing the KL divergence / maximizing the ELBO)
- Optimization with Adam (first-order, gradient-based) and L-BFGS-B (quasi-Newton, built on second-derivative approximations), both shown below
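Below is a minimal sketch of how these pieces fit together in GPflow 2. The inducing-point count, batch size, learning rate, and kernel are illustrative assumptions, not necessarily the settings used in this repository:

```python
import numpy as np
import tensorflow as tf
import gpflow

# MNIST as flat 784-dimensional vectors (no convolutional structure).
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype(np.float64) / 255.0
y_train = y_train.reshape(-1, 1).astype(np.float64)

num_classes = 10
num_inducing = 100  # M << N: cost drops from O(N^3) to O(N M^2)
Z = x_train[np.random.choice(len(x_train), num_inducing, replace=False)].copy()

# Sparse variational GP: one latent GP per class, multiclass likelihood.
model = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.MultiClass(num_classes),
    inducing_variable=Z,
    num_latent_gps=num_classes,
    num_data=len(x_train),
)

# Minibatch Adam training maximizes the ELBO, i.e. minimizes the KL
# divergence between the variational and the true posterior.
batches = iter(
    tf.data.Dataset.from_tensor_slices((x_train, y_train))
    .repeat()
    .shuffle(1024)
    .batch(256)
)
adam = tf.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(batch):
    with tf.GradientTape() as tape:
        loss = model.training_loss(batch)  # negative ELBO
    adam.apply_gradients(
        zip(tape.gradient(loss, model.trainable_variables),
            model.trainable_variables)
    )
    return loss

for _ in range(1000):
    train_step(next(batches))
```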
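Continuing from the sketch above, the L-BFGS-B alternative goes through GPflow's SciPy wrapper. It evaluates the loss on the full dataset at every step, so the subset size here is an assumption purely for illustration:

```python
# Deterministic full-batch training of the same model with L-BFGS-B.
subset = (x_train[:2000], y_train[:2000])  # illustrative subset
gpflow.optimizers.Scipy().minimize(
    model.training_loss_closure(subset),
    variables=model.trainable_variables,
    method="L-BFGS-B",
    options=dict(maxiter=100),
)
```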
tensorflow == 2.3
tensorflow_probability == 0.11.1
python == 3.8
gpflow == 2.1.4
1D regression is the easiest setting in which to visualize Gaussian processes, but the idea generalizes to higher dimensions and to multiclass classification, as sketched below.
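A minimal sketch of that 1D case, assuming synthetic sine-wave data made up purely for illustration:

```python
import numpy as np
import gpflow

# Synthetic 1D data: noisy observations of a sine wave.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 6.0, size=(30, 1))
Y = np.sin(X) + 0.1 * rng.standard_normal((30, 1))

# Exact GP regression; hyperparameters fitted with L-BFGS-B via SciPy.
model = gpflow.models.GPR((X, Y), kernel=gpflow.kernels.SquaredExponential())
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

# Predictive mean and variance on a grid: plotting mean +/- 2*sqrt(var)
# gives the usual GP uncertainty bands.
Xnew = np.linspace(-1.0, 7.0, 200).reshape(-1, 1)
mean, var = model.predict_f(Xnew)
```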