Created using Colaboratory
cemachelen committed Feb 26, 2024
1 parent c005648 commit 5781b07
Showing 1 changed file with 20 additions and 21 deletions.
41 changes: 20 additions & 21 deletions Gaussian_Processes_colab.ipynb
@@ -46,14 +46,14 @@
"\n",
"<h1> Gaussian Processes </h1>\n",
"\n",
"This tutorial is based on work done by Ollie Pollard on using Gaussian Processes to predict sea-level rise. Following the steps outlined in this [visual article](https://distill.pub/2019/visual-exploration-gaussian-processes/). Gaussian processes are often used to make predictions about our data by incorporating prior knowledge often to fit a function to a data set. For a given set of training points, there are potentially infinitely many functions that fit the data. Gaussian processes offer an elegant solution to this problem by assigning a probability to each of these functions. The mean of this probability distribution then represents the most probable characterization of the data.\n",
"This tutorial is based on work done by Ollie Pollard on using Gaussian Processes to predict sea-level rise. Following the steps outlined in this [visual article](https://distill.pub/2019/visual-exploration-gaussian-processes/). Gaussian processes are often used to make predictions about our data by incorporating prior knowledge to fit a function to a data set. For a given set of training points, there are potentially infinitely many functions that fit the data. Gaussian processes offer an elegant solution to this problem by assigning a probability to each of these functions. The mean of this probability distribution then represents the most probable characterization of the data.\n",
"\n",
"\n",
" \n",
"## Recommended reading\n",
"\n",
"* [Overview of Linear Regression](https://towardsdatascience.com/linear-regression-detailed-view-ea73175f6e86)\n",
"* [An Intuative Guide to Gaussian Processes](https://towardsdatascience.com/an-intuitive-guide-to-gaussian-processes-ec2f0b45c71d)\n",
"* [An Intuitive Guide to Gaussian Processes](https://towardsdatascience.com/an-intuitive-guide-to-gaussian-processes-ec2f0b45c71d)\n",
" \n",
"</div>\n",
"\n"
@@ -205,13 +205,13 @@
"\n",
"**Contents:**\n",
"\n",
"1. [Overview of Gausian Processes](#Overview-of-Gausian-Processes)\n",
"1. [Overview of Gaussian Processes](#Overview-of-Gaussian-Processes)\n",
"2. [Sea Level Example](#Sea-Level-Example)\n",
"1. [Load Data](#Load-Data)\n",
"2. [Normalise Data](#Normalise-Data)\n",
"3. [Plot Data](#Plot-Data)\n",
"4. [Define GP flow model](#Define-GP-flow-model)\n",
"5. [Optamization](#Optamization)\n",
"5. [Optimization](#Optimization)\n",
"6. [Prediction](#Prediction)\n",
"7. [Cross Validation](#Cross-Validation)\n",
"8. [Plot High Stands](#Plot-High-Stands)\n"
@@ -290,7 +290,7 @@
"source": [
"<div style=\"background-color: #cce5ff; padding: 10px;\">\n",
"\n",
"The next cell check's for GPUS. If the result is not what you expect check your python installation.\n",
"The next cell checks for GPUS. If the result is not what you expect check your python installation.\n",
"</div>"
]
},
@@ -355,7 +355,7 @@
"# some function y that we're going to try and find\n",
"# set x to between -6 and 6\n",
"x = np.linspace(-6,6,100)\n",
"# A random function chosem\n",
"# A random function chosen to demonstrate Gaussian Processes\n",
"y = x**3 - 9*x + np.exp(x**3/30)"
]
},
@@ -366,7 +366,7 @@
"id": "05515d9e"
},
"source": [
"# generate sample"
"# Generate sample"
]
},
{
@@ -487,13 +487,13 @@
"\n",
"From these few points, it's not obvious what function `y` could possibly be. So we can use Gaussian Processes to help work out the unknown function from these few points\n",
"\n",
"The covariance matrix $\\Sigma$ is determined by its covariance function $k$, which is often also called the kernel of the Gaussian process. Here we will use the [Radial Basis Function Kernal](https://towardsdatascience.com/radial-basis-function-rbf-kernel-the-go-to-kernel-acf0d22c798a):\n",
"The covariance matrix $\\Sigma$ is determined by its covariance function $k$, which is often also called the kernel of the Gaussian process. Here we will use the [Radial Basis Function Kernel](https://towardsdatascience.com/radial-basis-function-rbf-kernel-the-go-to-kernel-acf0d22c798a):\n",
" \n",
"\\begin{equation}\n",
" K(X_1,X_2) = exp(-{\\frac{||X_1 - X_2|| ^2}{2\\sigma^2}})\n",
"\\end{equation}\n",
"\n",
"where $\\sigma$ is the variance and hyperparameter and $||X_1 - X_2|| ^2$ is the Euclidean distance between two points\n",
"where $\\sigma$ is the variance and hyperparameter and $||X_1 - X_2|| ^2$ is the squared Euclidean distance between two points\n",
" \n",
"this is defined in the function `rbf_kernel`\n",
" \n",
@@ -609,7 +609,7 @@
"\n",
"Stochastic processes, such as Gaussian processes, are essentially a set of random variables. In addition, each of these random variables has a corresponding index $i$. We will use this index to refer to the $i$-th dimension of our nnn-dimensional multivariate distributions.\n",
" \n",
"Below, we have a two-dimensional normal distribution. Each dimension $y_i$ is assigned an index $\\displaystyle{i \\in {1,2}}$. This representation allows us to understand the connection between the covariance and the resulting values: the underlying Gaussian distribution has a positive covariance between $y_1$ and $y_1$ -  this means that $y_2$ will increases as $y_1$ gets larger and vice versa.\n",
"Below, we have a two-dimensional normal distribution. Each dimension $y_i$ is assigned an index $\\displaystyle{i \\in {1,2}}$. This representation allows us to understand the connection between the covariance and the resulting values: the underlying Gaussian distribution has a positive covariance between $y_1$ and $y_2$ -  this means that $y_2$ will increases as $y_1$ gets larger and vice versa.\n",
" \n",
"</div>\n"
]
@@ -768,7 +768,7 @@
"\n",
"## Posterior distribution\n",
"\n",
"Now we're going to activate training data which we can add back into our distribution. To give the posterior distribution (where we have incorporated the training data into our model).\n",
"Now we're going to activate training data which we can add back into our distribution to give the posterior distribution (where we have incorporated the training data into our model).\n",
"\n",
"1. First, we form the joint distribution between all x points `x` and the training points `x_samp` which gives `k_starX`\n",
"2. we can the use `k_xx` (covariance matrix of test x points) `k_starstar` covariance matrix of all x to calculate $\\mu_{pred}$ and $K_{pred}$\n",
@@ -929,7 +929,7 @@
"source": [
"<div style=\"background-color: #ccffcc; padding: 10px;\">\n",
"\n",
"We can plot below the mean prediction function and the standard deviation. Away from training points, the standard deviation is much higher. Reflecting the lack of knowledge in these areas.\n",
"We can plot below the mean prediction function and the standard deviation. Away from training points, the standard deviation is much higher, reflecting the lack of knowledge in these areas.\n",
"</div>"
]
},
@@ -978,7 +978,7 @@
"\n",
"<div style=\"background-color: #ccffcc; padding: 10px;\">\n",
"\n",
"Now let's look at an example using Gaussian Process to Predict sea-level change using a subset of sea level data from the RISeR dataset provided in python NumPy arrays by Oliver Pollard. \n",
"Now let's look at an example using Gaussian Process to predict sea-level change using a subset of sea level data from the RISeR dataset provided in python NumPy arrays by Oliver Pollard. \n",
"\n",
"Example of a global RSL change output for the Last Glacial Maximum to present (using the ICE-5G (Peltier et al., 2015) input to an implemented model (Han and Gomez, 2018) of sea-level theory (Kendall et al., 2005)).\n",
" \n",
@@ -1063,7 +1063,7 @@
"We can use 4 parameters provided in `parameter_data.npy` which correspond to the parameters listed below\n",
"\n",
"The Rates of Interglacial Sea-level Change and Responses (RISeR) dataset covers the early interglacial period and can be used to show the relative contribution to the Penultimate Glacial Period (PPGM) Max Ice\n",
"Volume and Last Interglacial (LIG) Highstnad from 4 parameters from sediment cores\n",
"Volume and Last Interglacial (LIG) Highstand from 4 parameters from sediment cores\n",
"\n",
"<img src=\"https://github.com/cemac/LIFD_GaussianProcesses/blob/main/images/sea_level_parameters.png?raw=1\">\n",
"\n",
@@ -1114,8 +1114,7 @@
" normalised_parameters[:,index] = (parameter + shift)/scale\n",
" normalisation_values.append((shift, scale))\n",
"\n",
" return normalised_parameters, normalisation_values\n",
""
" return normalised_parameters, normalisation_values\n"
]
},
{
@@ -1132,9 +1131,9 @@
"# move parameter ranges down to improve results\n",
"p1 = parameters[:,0] # Bedrock\n",
"p2 = parameters[:,1] # Onshore Sediment\n",
"p3 = parameters[:,2] # Marine Sediement\n",
"p3 = parameters[:,2] # Marine Sediment\n",
"p4 = parameters[:,3] # Ice streams\n",
"parmlist = ['Bedrock','Onshore Sediment','Marine Sediement','Ice streams']\n",
"parmlist = ['Bedrock','Onshore Sediment','Marine Sediment','Ice streams']\n",
"parameters = stack_parameters(p1, p2, p3, p4)\n",
"highstand = highstand.reshape(-1,1)\n",
"\n",
@@ -1313,7 +1312,7 @@
"id": "2847d55e"
},
"source": [
"## Optamization\n",
"## Optimization\n",
"\n",
"<div style=\"background-color: #cce5ff; padding: 10px;\">\n",
" \n",
@@ -1431,7 +1430,7 @@
"\n",
"<div style=\"background-color: #cce5ff; padding: 10px;\">\n",
" \n",
"we can now plot our predicted values (`predict`) against our actual values (`highstand`) *normalised*\n",
"We can now plot our predicted values (`predict`) against our actual values (`highstand`) *normalised*\n",
"</div>"
]
},
@@ -1459,7 +1458,7 @@
"id": "794b84f8"
},
"source": [
"## Plot High Stands\n",
"## Plot Highstands\n",
"\n",
"<div style=\"background-color: #cce5ff; padding: 10px;\">\n",
" \n",