diff --git a/Gaussian_Processes_colab.ipynb b/Gaussian_Processes_colab.ipynb
index be35246..44ecb7e 100644
--- a/Gaussian_Processes_colab.ipynb
+++ b/Gaussian_Processes_colab.ipynb
@@ -46,7 +46,7 @@
"\n",
"
\n",
@@ -91,7 +91,7 @@
" \n",
"\\begin{equation} X = \\begin{bmatrix} X_1 \\\\ X_2 \\\\ \\vdots \\\\ X_n \\end{bmatrix} \\sim \\mathcal{N}(\\mu, \\Sigma) \\end{equation}\n",
"\n",
- "The covariance matrix $\\Sigma$ describes the shape of the distribution. It is defined in terms of the expected value $E$\n",
+ "The covariance matrix $\\Sigma$ describes the shape of the distribution. It can be defined using the expectation operator $E$\n",
" \n",
"\\begin{equation} \n",
"\\Sigma = \\text{Cov}(X_i, X_j) = E \\left[ (X_i - \\mu_i)(X_j - \\mu_j)^T \\right]\n",
@@ -101,7 +101,7 @@
"Marginalization and conditioning both work on subsets of the original distribution and we will use the following notation:\n",
"\n",
"\\begin{equation}\n",
- "P_{X,Y} = \\begin{bmatrix} X \\\\ Y \\end{bmatrix} \\sim \\mathcal{N}(\\mu, \\Sigma) = \\mathcal{N} \\left( \\begin{bmatrix} \\mu_X \\\\ \\mu_Y \\end{bmatrix}, \\begin{bmatrix} \\Sigma_{XX} \\, \\Sigma_{XY} \\\\ \\Sigma_{YX} \\, \\Sigma_{YY} \\end{bmatrix} \\right) \n",
+ "\\begin{bmatrix} X \\\\ Y \\end{bmatrix} \\sim \\mathcal{N}(\\mu, \\Sigma) = \\mathcal{N} \\left( \\begin{bmatrix} \\mu_X \\\\ \\mu_Y \\end{bmatrix}, \\begin{bmatrix} \\Sigma_{XX} \\, \\Sigma_{XY} \\\\ \\Sigma_{YX} \\, \\Sigma_{YY} \\end{bmatrix} \\right) \n",
"\\end{equation}\n",
"\n",
"With $X$ and $Y$ representing subsets of original random variables. \n",
@@ -152,11 +152,11 @@
"\n",
"## Tensorflow and GPflow\n",
" \n",
- "There are many machine learning python libraries available, [TensorFlow](https://www.tensorflow.org/) a is one such library. Throughout this tutorial, you will see some complex machine learning tasks executed in just a few lines of code by calling [GPflow](https://gpflow.readthedocs.io/en/master/) functions which use Tensor flow. If you have GPUs on the machine you are using, these python libraries will automatically use them and run the code even faster!\n",
+ "There are many machine learning python libraries available, [TensorFlow](https://www.tensorflow.org/) a is one such library. Throughout this tutorial, you will see some complex machine learning tasks executed in just a few lines of code by calling [GPflow](https://gpflow.readthedocs.io/en/master/) functions which use TensorFlow. If you have GPUs on the machine you are using, these python libraries will automatically use them and run the code even faster!\n",
"\n",
"## Further Reading\n",
" \n",
- "* [GPflow example Notebooks](https://gpflow.readthedocs.io/en/develop/notebooks_file.html)\n",
+ "* [GPflow example notebooks](https://gpflow.readthedocs.io/en/develop/notebooks_file.html)\n",
"\n",
"\n",
" \n",
@@ -178,8 +178,8 @@
"\n",
" Python Packages:
\n",
"\n",
- "* Python 3.8\n",
- "* tensorflow > 2.1\n",
+ "* Python 3.11\n",
+ "* tensorflow > 2.3\n",
"* gpflow 2.1 *(must be installed via pip to get latest version)*\n",
"* numpy\n",
"* matplotlib\n",
@@ -190,7 +190,7 @@
"\n",
" Data Requirements
\n",
" \n",
- "This notebook refers to some data included in the git hub repository in the [data](data) folder\n",
+ "This notebook refers to some data included in the git hub repository in the [data](data) folder.\n",
" \n",
""
]
@@ -205,7 +205,7 @@
"\n",
"**Contents:**\n",
"\n",
- "1. [Overview of Gaussian Processes](#Overview-of-Gaussian-Processes)\n",
+ "1. [Overview of Gaussian Processes](#Overview-of-Gausian-Processes)\n",
"2. [Sea Level Example](#Sea-Level-Example)\n",
"1. [Load Data](#Load-Data)\n",
"2. [Normalise Data](#Normalise-Data)\n",
@@ -226,7 +226,7 @@
"source": [
"\n",
" \n",
- "Load in all required modules and turn off warnings. If you have no [GPU's](https://www.analyticsvidhya.com/blog/2020/09/why-gpus-are-more-suited-for-deep-learning/) available you may see some tensor flow warnings\n",
+ "Load in all required modules and turn off warnings. If you have no [GPU](https://www.analyticsvidhya.com/blog/2020/09/why-gpus-are-more-suited-for-deep-learning/) available, you may see some TensorFlow warnings.\n",
"\n",
"
\n"
]
@@ -251,9 +251,9 @@
"pip install gpflow"
],
"metadata": {
- "id": "w4mqZ8EWbKUV"
+ "id": "qu4lLNxlRndl"
},
- "id": "w4mqZ8EWbKUV",
+ "id": "qu4lLNxlRndl",
"execution_count": null,
"outputs": []
},
@@ -272,13 +272,12 @@
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import plotly.express as px\n",
- "\n",
"from gpflow.utilities import print_summary\n",
"from scipy.spatial.distance import cdist\n",
"from scipy.stats import multivariate_normal\n",
"\n",
"gpflow.config.set_default_summary_fmt(\"notebook\")\n",
- "plt.style.use(\"seaborn-whitegrid\")"
+ "plt.style.use(\"seaborn-v0_8-whitegrid\")"
]
},
{
@@ -290,7 +289,7 @@
"source": [
"\n",
"\n",
- "The next cell checks for GPUS. If the result is not what you expect check your python installation.\n",
+ "The next cell checks for GPUs. If the result is not what you expect, check your python installation.\n",
"
"
]
},
@@ -321,9 +320,9 @@
"\n",
"\n",
"\n",
- "Say we want to find an unknown function `y` where $y = x^3 - 9x + e^{(x^3/30)}$ from a random sample of points [`x_samp`, `ysamp`]\n",
+ "Say we want to find an unknown function `f` where $f(x) = x^3 - 9x + e^{(x^3/30)}$ from a random sample of points [`x_samp`, `y_samp`]\n",
" \n",
- "**NB function `y` is chosen at random for demonstrative purposes you can change `y` to whatever you like and see similar results!**\n",
+ "**NB function `f(x)` is chosen arbitrarily for this demonstration. You can change `f` to whatever you like and see similar results!**\n",
" \n",
"\n",
"
"
@@ -338,7 +337,7 @@
"source": [
"\n",
"\n",
- "After you've run through the code you might want to re-run with a different choice of `y`\n",
+ "After you've run through the code you might want to re-run with a different choice of `f`.\n",
" \n",
"
"
]
@@ -352,10 +351,10 @@
},
"outputs": [],
"source": [
- "# some function y that we're going to try and find\n",
+ "# simulated data from some function that we're going to try and find\n",
"# set x to between -6 and 6\n",
"x = np.linspace(-6,6,100)\n",
- "# A random function chosen to demonstrate Gaussian Processes\n",
+ "# y = f(x) chosen to demonstrate Gaussian Processes\n",
"y = x**3 - 9*x + np.exp(x**3/30)"
]
},
@@ -379,7 +378,7 @@
"outputs": [],
"source": [
"# a random sample of points\n",
- "# whole x range is between -6 and 6 and we'll sample x between -2 and 2\n",
+ "# whole x range is between -6 and 6 and we'll sample x between -3 and 3\n",
"sample_size = 5\n",
"\n",
"samp_index = np.random.randint((len(x)-1)/4, high=(len(x)-1)*3/4, size=sample_size)"
@@ -394,7 +393,7 @@
"source": [
"\n",
"\n",
- "Later if you wish you can uncomment the below cell to see the effects of changing the sample size or sample area (below is set to sample the whole range with a sample size of 8)\n",
+ "Later, if you wish, you can uncomment the below cell to see the effects of changing the sample size or sample area (below is set to sample the whole range with a sample size of 8).\n",
" \n",
"
"
]
@@ -421,8 +420,8 @@
"source": [
"\n",
"\n",
- "As we've taken a random sample there's a small chance we might have sampled the same point twice so the below code is going to check we have 5 unique sample points \n",
- " \n",
+ "As we've taken a random sample there's a small chance we might have sampled the same point twice so the below code is going to check we have 5 unique sample points.\n",
+ "\n",
"
"
]
},
@@ -485,17 +484,17 @@
"source": [
"\n",
"\n",
- "From these few points, it's not obvious what function `y` could possibly be. So we can use Gaussian Processes to help work out the unknown function from these few points\n",
+ "From these few points, it's not obvious what function `f` could possibly be. So we can use Gaussian Processes to help work out the unknown function from these few points.\n",
"\n",
- "The covariance matrix $\\Sigma$ is determined by its covariance function $k$, which is often also called the kernel of the Gaussian process. Here we will use the [Radial Basis Function Kernel](https://towardsdatascience.com/radial-basis-function-rbf-kernel-the-go-to-kernel-acf0d22c798a):\n",
+ "The covariance matrix $\\Sigma$ is determined by its covariance function $K$, which is often also called the kernel of the Gaussian process. Here we will use the [Radial Basis Function Kernel](https://towardsdatascience.com/radial-basis-function-rbf-kernel-the-go-to-kernel-acf0d22c798a):\n",
" \n",
"\\begin{equation}\n",
- " K(X_1,X_2) = exp(-{\\frac{||X_1 - X_2|| ^2}{2\\sigma^2}})\n",
+ " K(X_1,X_2) = \\sigma^2 \\exp \\left( -\\dfrac{||X_1 - X_2||^2}{2 l^2} \\right)\n",
"\\end{equation}\n",
"\n",
- "where $\\sigma$ is the variance and hyperparameter and $||X_1 - X_2|| ^2$ is the squared Euclidean distance between two points\n",
+ "where $\\sigma^2$ and $l$ are the variance and lengthscale hyperparameters respectively and $||X_1 - X_2|| ^2$ is the squared Euclidean distance between two points.\n",
" \n",
- "this is defined in the function `rbf_kernel`\n",
+ "This covariance function is defined in the function `rbf_kernel`.\n",
" \n",
"
"
]
@@ -512,14 +511,14 @@
"# radial basis function kernel\n",
"def rbf_kernel(x1, x2, var, lscale):\n",
" \"\"\"\n",
- " Compute the Euclidean distance between each row of X and X2, or between\n",
- " each pair of rows of X if X2 is None and feed it to the kernel.\n",
- " \"\"\"\n",
+ " Compute the Euclidean distance between each row of X1 and X2, or between\n",
+ " each pair of rows of X1 if X2 is None and feed it to the kernel.\n",
+ " \"\"\"\n",
" if x2 is None:\n",
" d = cdist(x1, x1)\n",
" else:\n",
" d = cdist(x1, x2)\n",
- " K = var*np.exp(-np.power(d,2)/(lscale**2))\n",
+ " K = var*np.exp(-np.power(d,2)/(2*lscale**2))\n",
" return K"
]
},
@@ -532,7 +531,7 @@
"source": [
"\n",
"\n",
- "so `K` can be obtained via `rbf_kernel` and $\\mu$ `mu` is often assumed to be zero as a starting point\n",
+ "So `K` can be obtained via `rbf_kernel` and $\\mu$ `mu` is often assumed to be zero as a starting point.\n",
"
"
]
},
@@ -545,11 +544,10 @@
"source": [
"\n",
"\n",
- "try adjusting `lscale` to see what happens with the results\n",
- "e.g. after running through the first time set `lscale=5` and run through all cells again\n",
- " \n",
- "Increasing the length parameter increases the banding, as points further away from each other become more correlated.\n",
- " \n",
+ "Try adjusting `lscale` to see what happens with the results. E.g., after running through the first time, set `lscale=5` and run through all cells again.\n",
+ "\n",
+ "Increasing the lengthscale parameter increases the banding, as points further away from each other become more correlated.\n",
+ "\n",
"
"
]
},
@@ -609,7 +607,7 @@
"\n",
"Stochastic processes, such as Gaussian processes, are essentially a set of random variables. In addition, each of these random variables has a corresponding index $i$. We will use this index to refer to the $i$-th dimension of our nnn-dimensional multivariate distributions.\n",
" \n",
- "Below, we have a two-dimensional normal distribution. Each dimension $y_i$ is assigned an index $\\displaystyle{i \\in {1,2}}$. This representation allows us to understand the connection between the covariance and the resulting values: the underlying Gaussian distribution has a positive covariance between $y_1$ and $y_2$ - this means that $y_2$ will increases as $y_1$ gets larger and vice versa.\n",
+ "Below, we have a two-dimensional normal distribution. Each dimension $y_i$ is assigned an index $\\displaystyle{i \\in {1,2}}$. This representation allows us to understand the connection between the covariance and the resulting values: the underlying Gaussian distribution has a positive covariance between $y_1$ and $y_2$ - this means that $y_2$ will increase as $y_1$ gets larger and vice versa.\n",
" \n",
"\n"
]
@@ -650,7 +648,7 @@
"source": [
"\n",
" \n",
- "The more horizontal the lines are the more strongly correlated\n",
+ "The stronger the correlation, the more horizontal the lines will be.\n",
"
"
]
},
@@ -679,7 +677,7 @@
"## Prior distribution\n",
"\n",
"\n",
- "The following figure shows samples of potential functions from prior distributions (the case where we have not yet observed any training data) that were created using RBF kernel\n",
+ "The following figure shows samples of potential functions from our prior distribution (the case where we have not yet observed any training data) that were created using the RBF kernel.\n",
" \n",
""
]
@@ -719,7 +717,7 @@
"K_rbf = rbf_kernel(x.reshape(-1,1), None, 1.0, lscale)\n",
"# mu = 0\n",
"mu = np.zeros(x.shape[0])\n",
- "# fuctions from rbf kernel\n",
+ "# functions from rbf kernel\n",
"f_rbf = np.random.multivariate_normal(mu, K_rbf, 100)"
]
},
@@ -753,7 +751,7 @@
"source": [
"\n",
" \n",
- "as $\\mu$ is set to zero all functions are distributed normally around the mean $\\mu$ (0)\n",
+ "as the mean $\\mu$ is set to zero, all functions are distributed normally around zero.\n",
"
"
]
},
@@ -770,8 +768,10 @@
"\n",
"Now we're going to activate training data which we can add back into our distribution to give the posterior distribution (where we have incorporated the training data into our model).\n",
"\n",
- "1. First, we form the joint distribution between all x points `x` and the training points `x_samp` which gives `k_starX`\n",
- "2. we can the use `k_xx` (covariance matrix of test x points) `k_starstar` covariance matrix of all x to calculate $\\mu_{pred}$ and $K_{pred}$\n",
+ "Let $\\mathbf{K}$ denote the prior covariance matrix of the training points derived from our RBF kernel. We define $\\mathbf{K}_*$ as the prior cross-covariance matrix of our training points and the points at which we wish to make predictions. Finally, let $\\mathbf{K}_{*,*}$ be the prior covariance matrix of the points at which to make predictions.\n",
+ "\n",
+ "1. First, we form the joint distribution of all the x points `x` and the training points `x_samp` which gives cross-covariance matrix `k_starX` .\n",
+ "2. We can the use `k_xx` (covariance matrix of training points) `k_starstar` covariance matrix of all x to calculate $\\boldsymbol \\mu_{\\text{pred}}$ and $\\mathbf{K}_{\\text{pred}}$\n",
" \n",
"\\begin{equation}\n",
"\\boldsymbol \\mu_{\\text{pred}} = \\mathbf{K}_*^\\top \\left[\\mathbf{K} + \\sigma^2 \\mathbf{I}\\right]^{-1} \\mathbf{y}\n",
@@ -793,10 +793,13 @@
},
"outputs": [],
"source": [
+ "# cross-covariance matrix\n",
"k_starX = rbf_kernel(x.reshape(-1,1), x_samp.reshape(-1,1), 3, lscale)\n",
"\n",
- "# K from sample point\n",
+ "# prior covariance matrix of training points\n",
"k_xx = rbf_kernel(x_samp.reshape(-1,1), None, 3, lscale)\n",
+ "\n",
+ "# prior covariance matrix of test points\n",
"k_starstar = rbf_kernel(x.reshape(-1,1), None, 3, lscale)"
]
},
@@ -809,9 +812,9 @@
},
"outputs": [],
"source": [
- "print('no training data multivariate Gaussian distribution shape = ' +str(k_starstar.shape))\n",
- "print('sample-whole multivariate Gaussian distribution shape = ' +str(k_starX.shape))\n",
- "print('whole multivariate Gaussian distribution shape = ' +str(k_starX.shape))"
+ "print('test points prior covariance matrix shape = ' +str(k_starstar.shape))\n",
+ "print('training-test point prior cross-covariance matrix shape = ' +str(k_starX.shape))\n",
+ "print('training points prior covariance matrix shape = ' +str(k_xx.shape))"
]
},
{
@@ -830,6 +833,18 @@
" print('ERROR PLEASE RETURN TO GENERATE SAMPLE')"
]
},
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "65afd2d7-8175-4707-8791-f16bd555550c",
+ "metadata": {
+ "id": "65afd2d7-8175-4707-8791-f16bd555550c"
+ },
+ "outputs": [],
+ "source": [
+ "print('test points posterior covariance matrix shape = ' +str(var.shape))"
+ ]
+ },
{
"cell_type": "markdown",
"id": "b65b08da",
@@ -839,7 +854,7 @@
"source": [
"\n",
"\n",
- "If you get an error here please return to [generate sample](#generate-sample) section to check for duplicated sample points\n",
+ "If you get an error here please return to [generate sample](#generate-sample) section to check for duplicated sample points.\n",
"\n",
"
"
]
@@ -852,7 +867,8 @@
},
"source": [
"\n",
- "In the constrained covariance matrix `var`, we can see that the correlation of neighbouring points is affected by the training data. If a predicted point lies on the training data, there is no correlation with other points. Therefore, the function must pass directly through it. Predicted values further away are also affected by the training data: proportional to their distance.\n",
+ " \n",
+ "In the constrained (posterior) covariance matrix `var`, we can see that the correlation of neighbouring points is affected by the training data. If a predicted point lies on the training data, there is no correlation with other points. Therefore, the function must pass directly through it. Predicted values further away are also affected by the training data to an extent which decreases with distance.\n",
"
"
]
},
@@ -865,9 +881,11 @@
},
"outputs": [],
"source": [
+ "# heatmap of test points posterior covariance matrix\n",
"fig, ax = plt.subplots()\n",
"ax.imshow(var, cmap=\"BuPu\", interpolation='None')\n",
- "ax.axis(False)"
+ "ax.axis(False)\n",
+ "ax.set_title(\"Posterior covariance matrix\")"
]
},
{
@@ -894,7 +912,7 @@
"source": [
"\n",
"\n",
- "Now we have our function predictions `f_star`that intercepts all training points\n",
+ "Now we have our function predictions `f_star` that intercept all training points.\n",
"
"
]
},
@@ -929,7 +947,7 @@
"source": [
"\n",
"\n",
- "We can plot below the mean prediction function and the standard deviation. Away from training points, the standard deviation is much higher, reflecting the lack of knowledge in these areas.\n",
+ "We can plot below the mean prediction function and the standard deviation. Away from training points, the standard deviation is much higher, reflecting the lack of knowledge in these regions of the x axis.\n",
"
"
]
},
@@ -952,7 +970,7 @@
"ax.set_ylim([-10,15])\n",
"ax.set_xlabel(\"x\")\n",
"ax.set_ylabel(\"f(x)\")\n",
- "ax.legend(['actual','predicted'])"
+ "ax.legend(['training','actual','predicted'])"
]
},
{
@@ -963,7 +981,7 @@
},
"source": [
"\n",
- "Since the RBF kernel is stationary it will always return to $\\mu=0$ in regions further away from observed training data. This decreases the accuracy of predictions that reach further into the past or the future.\n",
+ "Since the RBF kernel is stationary, the posterior mean function will always return to $\\mu=0$ in regions further away from observed training data. In our example, this results in decreased accuracy of predictions as x tends to $\\pm \\infty$. This figure indicates that our prior uncertainty was too low, as the true function lies many standard deviations away from the prior mean (zero). Not much can be done about this in practice: it is essentially a consequence of trying to empirically model a rapidly diverging function without using domain-specific knowledge.\n",
"
"
]
},
@@ -978,9 +996,9 @@
"\n",
"\n",
"\n",
- "Now let's look at an example using Gaussian Process to predict sea-level change using a subset of sea level data from the RISeR dataset provided in python NumPy arrays by Oliver Pollard. \n",
+ "Now let's look at an example using Gaussian Processes to predict sea-level change using a subset of sea-level data from the RISeR dataset, provided in Python NumPy arrays by Oliver Pollard. \n",
"\n",
- "Example of a global RSL change output for the Last Glacial Maximum to present (using the ICE-5G (Peltier et al., 2015) input to an implemented model (Han and Gomez, 2018) of sea-level theory (Kendall et al., 2005)).\n",
+ "Example of a global Relative Sea Level (RSL) change output for the Last Glacial Maximum to present (using the ICE-5G (Peltier et al., 2015) input to an implemented model (Han and Gomez, 2018) of sea-level theory (Kendall et al., 2005)).\n",
" \n",
"
\n",
"\n",
@@ -1013,7 +1031,7 @@
"source": [
"
\n",
"\n",
- "For this example we're using data provided in python [.npy files](https://towardsdatascience.com/what-is-npy-files-and-why-you-should-use-them-603373c78883) containing relative sea-level change at a single point in the southern north sea at 122 (ka). `highstand` data we will use to test our predictions\n",
+ "For this example we're using data provided in python [.npy files](https://towardsdatascience.com/what-is-npy-files-and-why-you-should-use-them-603373c78883) containing relative sea-level change at a single point in the southern North Sea at 122 (ka). `highstand` data we will use to test our predictions\n",
" \n",
"
\n",
" \n",
@@ -1030,14 +1048,14 @@
"wget https://raw.githubusercontent.com/cemac/LIFD_GaussianProcesses/469cc8b75c903a5976ddf37fff300d58b853bd6a/data/predict_points.npy\n",
"\n",
"mkdir data/\n",
- "mv highstand_data.npy parameter_data.npy predict_points.npy data/"
+ "mv highstand_data.npy parameter_data.npy predict_points.npy data/\n"
],
"metadata": {
- "id": "rnmTegC9XJry"
+ "id": "25esZTHtSLsE"
},
+ "id": "25esZTHtSLsE",
"execution_count": null,
- "outputs": [],
- "id": "rnmTegC9XJry"
+ "outputs": []
},
{
"cell_type": "code",
@@ -1062,7 +1080,7 @@
" \n",
"We can use 4 parameters provided in `parameter_data.npy` which correspond to the parameters listed below\n",
"\n",
- "The Rates of Interglacial Sea-level Change and Responses (RISeR) dataset covers the early interglacial period and can be used to show the relative contribution to the Penultimate Glacial Period (PPGM) Max Ice\n",
+ "The Rates of Interglacial Sea-level change and Responses (RISeR) dataset covers the early interglacial period and can be used to show the relative contribution to the Penultimate Glacial Period (PGM) Max Ice\n",
"Volume and Last Interglacial (LIG) Highstand from 4 parameters from sediment cores\n",
"\n",
"
\n",
@@ -1114,7 +1132,8 @@
" normalised_parameters[:,index] = (parameter + shift)/scale\n",
" normalisation_values.append((shift, scale))\n",
"\n",
- " return normalised_parameters, normalisation_values\n"
+ " return normalised_parameters, normalisation_values\n",
+ ""
]
},
{
@@ -1122,12 +1141,10 @@
"execution_count": null,
"id": "breeding-membership",
"metadata": {
- "scrolled": false,
"id": "breeding-membership"
},
"outputs": [],
"source": [
- "\n",
"# move parameter ranges down to improve results\n",
"p1 = parameters[:,0] # Bedrock\n",
"p2 = parameters[:,1] # Onshore Sediment\n",
@@ -1267,13 +1284,13 @@
"\n",
"
\n",
"\n",
- "We're going to use the python library [GP flow model](https://gpflow.readthedocs.io/en/master/notebooks/basics/regression.html) to create our Gaussian process model to create a more complex model that in our previous example in fewer lines of code\n",
- " \n",
- "`k = gpflow.kernels.Matern52(lengthscales=lscale)` selects a [Matérn covariance function](https://en.wikipedia.org/wiki/Mat%C3%A9rn_covariance_function) for the [GP flow kernel](https://gpflow.readthedocs.io/en/master/notebooks/advanced/kernels.html) \n",
+ "We're going to use the Python library [GPflow](https://www.gpflow.org/) to build a more complex Gaussian-process model than that in our previous example, and in fewer lines of code.\n",
"\n",
- "`m = gpflow.models.GPR(data=(X, Y), kernel=k, mean_function=None)` constructs a regression model from data points and the selected kernal.\n",
+ "`k = gpflow.kernels.Matern52(lengthscales=lscale)` selects a [Matérn covariance function](https://en.wikipedia.org/wiki/Mat%C3%A9rn_covariance_function) for the [GPflow kernel](https://gpflow.github.io/GPflow/develop/notebooks/getting_started/kernels.html).\n",
"\n",
- "to inspect the chosen kernel you can run the `print_summary` command which should show you a list of hyperparameters: `variance` and `lengthscale` and will display information about those hyperparameters which will start at default values of 1 and the [transformation](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus) applied.\n",
+ "`m = gpflow.models.GPR(data=(X, Y), kernel=k, mean_function=None)` constructs a regression model from data points and the selected kernel.\n",
+ "\n",
+ "To inspect the chosen kernel, you can run the `print_summary` command, which should show you a list of hyperparameters: `variance` and `lengthscale`, and will display information about those hyperparameters (which will start at default values of 1), as well as the [transformation](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Softplus) applied.\n",
"\n",
"
"
]
@@ -1316,9 +1333,9 @@
"\n",
"
\n",
" \n",
- "`opt = gpflow.optimizers.Scipy()` uses the Scipy optimizer, which by default implements the [Limited Memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS-B)](https://en.wikipedia.org/wiki/Limited-memory_BFGS) algorithm\n",
+ "`opt = gpflow.optimizers.Scipy()` uses the Scipy optimizer, which by default implements the [Limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS-B)](https://en.wikipedia.org/wiki/Limited-memory_BFGS) algorithm.\n",
" \n",
- "`opt.minimize(m.training_loss, m.trainable_variables, options=dict(maxiter=100))` calls the minimize method of an optimizer which uses the training_loss defined by the GPflow model and the variables to train with and the number of iterations\n",
+ "`opt.minimize(m.training_loss, m.trainable_variables, options=dict(maxiter=100))` calls the minimize method of an optimizer which uses the training loss function defined by the GPflow model, the variables to train, and the number of training iterations.\n",
"\n",
"
"
]
@@ -1349,7 +1366,7 @@
"\n",
"
\n",
" \n",
- "`m.predict_f` predicts at the new points producing a mean and variance we'll create an array of predicted means `predict` and variance `predict_var` by looping over the normalised training data from our parameters \n",
+ "`m.predict_f` make predictions at the new points, comprising a predictive mean and variance. We'll create an array of predicted means `predict` and variances `predict_var` by looping over the normalised training data from our parameters. \n",
"
"
]
},
@@ -1430,7 +1447,7 @@
"\n",
"
\n",
" \n",
- "We can now plot our predicted values (`predict`) against our actual values (`highstand`) *normalised*\n",
+ "We can now plot our predicted values (`predict`) against our actual values (`highstand`) *normalised*.\n",
"
"
]
},
@@ -1462,7 +1479,7 @@
"\n",
"
\n",
" \n",
- "We can now plot the actual predictions (`mean`) in purple with the grey dots agreement with the actual data (`highstand_norm`) \n",
+ "We can now plot the predictions (`mean`) in purple with the grey dots agreement with the actual data (`highstand_norm`) \n",
"
"
]
},
@@ -1490,18 +1507,18 @@
},
{
"cell_type": "code",
- "execution_count": null,
- "id": "aa9f0dd6",
+ "source": [],
"metadata": {
- "id": "aa9f0dd6"
+ "id": "ItcbcwvsSUhN"
},
- "outputs": [],
- "source": []
+ "id": "ItcbcwvsSUhN",
+ "execution_count": null,
+ "outputs": []
}
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1515,7 +1532,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.8.10"
+ "version": "3.11.7"
},
"colab": {
"provenance": [],