From 4cbf3ab269deeb3fd561c9565cc3dd6c05d7b203 Mon Sep 17 00:00:00 2001
From: Pattrigue
Date: Tue, 28 May 2024 14:12:53 +0200
Subject: [PATCH 1/4] write about ngboost (citation needed)

---
 report_thesis/src/sections/background.tex | 35 +++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/report_thesis/src/sections/background.tex b/report_thesis/src/sections/background.tex
index ccb912e0..7278d180 100644
--- a/report_thesis/src/sections/background.tex
+++ b/report_thesis/src/sections/background.tex
@@ -357,6 +357,41 @@ \subsubsection{Gradient Boosting Regression (GBR)}\label{sec:gradientboost}
 In the context of regression, gradient boosting aims to minimize the difference between the predicted values and the actual target values by fitting successive trees to the residuals.
 To minimize errors, gradient descent is used to iteratively update model parameters in the direction of the negative gradient of the loss function, thereby following the path of steepest descent~\cite{gradientLossFunction}.
 
+\subsubsection{Natural Gradient Boosting (NGBoost)}
+Having introduced \gls{gbr}, we now give an overview of \gls{ngboost}.
+
+\gls{ngboost} is a variant of the gradient boosting algorithm that leverages the concept of natural gradients with the goal of improving convergence speed and model performance.
+In more complex models, the parameter space can be curved and thus non-Euclidean, making standard gradient descent less effective.
+Consequently, using standard gradient descent can lead to slow convergence and suboptimal performance.
+This is where natural gradients come in.
+
+Natural gradients account for the underlying geometry of the parameter space by using information about its curvature.
+By incorporating this information, natural gradients can navigate the parameter space more efficiently, leading to faster convergence and better performance.
+In addition, \gls{ngboost} provides its predictions in the form of probability distributions, allowing it to estimate the uncertainty associated with its predictions.
+
+The algorithm starts by initializing a model with a guess for the parameters of the probability distribution, typically a simple distribution such as a Gaussian.
+This initial model prediction represents the probability distribution over the target variable based on the given features.
+
+Then, the algorithm enters an iterative process to refine its predictions.
+At the start of each iteration, the model computes its current predictions using the existing set of parameters.
+The algorithm then calculates the negative gradient of the loss function with respect to the current predictions.
+This involves computing the gradient of the negative log-likelihood, which quantifies the discrepancy between the current predictions and the actual observed data.
+The negative log-likelihood measures how well the model's predicted probability distribution matches the observed data, with lower values indicating better alignment between predictions and observations.
+
+Next, the \textit{Fisher information matrix} is computed.
+This matrix encodes the curvature of the parameter space at the current parameter values, reflecting how sensitive the likelihood function is to changes in these parameters.
+For example, if the likelihood function is highly sensitive to changes in a particular parameter, the Fisher information matrix will have a high value for that parameter.
+Using this information, the model can adjust its parameters more effectively, focusing on the most sensitive parameters to improve performance.
+
+The standard gradient, derived from the negative log-likelihood, is then transformed using the inverse of the Fisher information matrix to obtain what is known as the natural gradient.
+Next, a weak learner, typically a decision tree, is fitted to these natural gradients.
+This step is similar to traditional gradient boosting, where a tree is fitted to the residuals, but in \gls{ngboost}, the tree is fitted to the natural gradients instead.
+
+The parameters of the model are then updated using the output from the weak learner.
+This update process incorporates a learning rate to control the step size, ensuring that the model makes gradual improvements rather than drastic changes.
+
+Using the newly updated parameters, the model recalculates its predictions, refining the probability distribution of the target variable.
+This iterative process of computing predictions, calculating gradients, fitting weak learners, and updating parameters continues for a predetermined number of iterations or until the model's performance converges.
 
 \subsubsection{XGBoost}
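To make the training loop this patch describes concrete, the following is a minimal from-scratch sketch for a Gaussian predictive distribution parameterized as (mu, log sigma). All names are ours and the sketch omits the scaling line search of the reference implementation; it is an illustration of the steps above under those assumptions, not the published algorithm.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def fit_ngboost_sketch(X, y, n_iter=100, lr=0.05):
        """Illustrative NGBoost-style loop for a Gaussian N(mu, sigma)."""
        n = len(y)
        # Initialize the distribution parameters from the marginal of y.
        mu = np.full(n, y.mean())
        log_sigma = np.full(n, np.log(y.std()))
        trees = []
        for _ in range(n_iter):
            sigma2 = np.exp(2 * log_sigma)
            # Gradient of the negative log-likelihood w.r.t. (mu, log sigma).
            g_mu = (mu - y) / sigma2
            g_ls = 1.0 - (y - mu) ** 2 / sigma2
            # The Fisher information for this parameterization is diagonal,
            # diag(1 / sigma^2, 2), so the natural gradient I^{-1} g is cheap.
            nat_mu = sigma2 * g_mu  # = mu - y
            nat_ls = g_ls / 2.0
            # Fit one weak learner per distribution parameter to the
            # natural gradients (analogous to fitting residuals in GBR).
            t_mu = DecisionTreeRegressor(max_depth=3).fit(X, nat_mu)
            t_ls = DecisionTreeRegressor(max_depth=3).fit(X, nat_ls)
            # Gradient step scaled by a learning rate; at prediction time
            # the stored trees' outputs would be summed for unseen inputs.
            mu = mu - lr * t_mu.predict(X)
            log_sigma = log_sigma - lr * t_ls.predict(X)
            trees.append((t_mu, t_ls))
        return mu, log_sigma, trees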
From 6abef24e0bf9b5e7662f8f7c6ebaf21ef6156338 Mon Sep 17 00:00:00 2001
From: Pattrigue
Date: Tue, 28 May 2024 14:14:22 +0200
Subject: [PATCH 2/4] add citation

---
 report_thesis/src/references.bib          | 19 ++++++++++++++++++-
 report_thesis/src/sections/background.tex |  2 +-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/report_thesis/src/references.bib b/report_thesis/src/references.bib
index 571881d3..26312540 100644
--- a/report_thesis/src/references.bib
+++ b/report_thesis/src/references.bib
@@ -563,4 +563,21 @@ @book{learningwithkernels
   isbn = {9780262256933},
   doi = {10.7551/mitpress/4175.001.0001},
   url = {https://doi.org/10.7551/mitpress/4175.001.0001},
-}
\ No newline at end of file
+}
+
+
+@misc{duan_ngboost_2020,
+  title = {{NGBoost}: {Natural} {Gradient} {Boosting} for {Probabilistic} {Prediction}},
+  shorttitle = {{NGBoost}},
+  url = {http://arxiv.org/abs/1910.03225},
+  abstract = {We present Natural Gradient Boosting (NGBoost), an algorithm for generic probabilistic prediction via gradient boosting. Typical regression models return a point estimate, conditional on covariates, but probabilistic regression models output a full probability distribution over the outcome space, conditional on the covariates. This allows for predictive uncertainty estimation — crucial in applications like healthcare and weather forecasting. NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm. Furthermore, we show how the Natural Gradient is required to correct the training dynamics of our multiparameter boosting approach. NGBoost can be used with any base learner, any family of distributions with continuous parameters, and any scoring rule. NGBoost matches or exceeds the performance of existing methods for probabilistic prediction while offering additional benefits in flexibility, scalability, and usability. An open-source implementation is available at github.com/stanfordmlgroup/ngboost.},
+  language = {en},
+  urldate = {2024-05-28},
+  publisher = {arXiv},
+  author = {Duan, Tony and Avati, Anand and Ding, Daisy Yi and Thai, Khanh K. and Basu, Sanjay and Ng, Andrew Y. and Schuler, Alejandro},
+  month = jun,
+  year = {2020},
+  note = {arXiv:1910.03225 [cs, stat]},
+  keywords = {Computer Science - Machine Learning, Statistics - Machine Learning},
+  annote = {Comment: Accepted for ICML 2020},
+}
diff --git a/report_thesis/src/sections/background.tex b/report_thesis/src/sections/background.tex
index 7278d180..5f576b83 100644
--- a/report_thesis/src/sections/background.tex
+++ b/report_thesis/src/sections/background.tex
@@ -358,7 +358,7 @@ \subsubsection{Gradient Boosting Regression (GBR)}\label{sec:gradientboost}
 To minimize errors, gradient descent is used to iteratively update model parameters in the direction of the negative gradient of the loss function, thereby following the path of steepest descent~\cite{gradientLossFunction}.
 
 \subsubsection{Natural Gradient Boosting (NGBoost)}
-Having introduced \gls{gbr}, we now give an overview of \gls{ngboost}.
+Having introduced \gls{gbr}, we now give an overview of \gls{ngboost} based on \citet{duan_ngboost_2020}.
 
 \gls{ngboost} is a variant of the gradient boosting algorithm that leverages the concept of natural gradients with the goal of improving convergence speed and model performance.
 In more complex models, the parameter space can be curved and thus non-Euclidean, making standard gradient descent less effective.
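The abstract above points to the open-source implementation at github.com/stanfordmlgroup/ngboost. For orientation, a minimal usage sketch follows; the NGBRegressor/pred_dist interface follows the package's documented API (which may differ across versions), and the synthetic data is our own.

    import numpy as np
    from ngboost import NGBRegressor  # pip install ngboost

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=500)

    # By default NGBRegressor fits a Gaussian by maximizing log-likelihood.
    ngb = NGBRegressor(n_estimators=300, learning_rate=0.03).fit(X, y)

    point = ngb.predict(X[:5])    # point estimates (predictive means)
    dist = ngb.pred_dist(X[:5])   # full predictive distributions
    print(point)
    print(dist.params["loc"], dist.params["scale"])  # per-sample mean and std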
From 694521d56ccbcb9bded6e2489eca9ff118e33346 Mon Sep 17 00:00:00 2001
From: Pattrigue <57709490+Pattrigue@users.noreply.github.com>
Date: Tue, 28 May 2024 19:54:43 +0200
Subject: [PATCH 3/4] Update report_thesis/src/sections/background.tex

Co-authored-by: Ivikhostrup <56341364+Ivikhostrup@users.noreply.github.com>
---
 report_thesis/src/sections/background.tex | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/report_thesis/src/sections/background.tex b/report_thesis/src/sections/background.tex
index 5f576b83..53e32eaa 100644
--- a/report_thesis/src/sections/background.tex
+++ b/report_thesis/src/sections/background.tex
@@ -376,7 +376,7 @@ \subsubsection{Natural Gradient Boosting (NGBoost)}
 At the start of each iteration, the model computes its current predictions using the existing set of parameters.
 The algorithm then calculates the negative gradient of the loss function with respect to the current predictions.
 This involves computing the gradient of the negative log-likelihood, which quantifies the discrepancy between the current predictions and the actual observed data.
-The negative log-likelihood measures how well the model's predicted probability distribution matches the observed data, with lower values indicating better alignment between predictions and observations.
+The negative log-likelihood quantifies how well the model's predicted probability distribution matches the observed data, with lower values indicating better alignment between predictions and observations.
 
 Next, the \textit{Fisher information matrix} is computed.
 This matrix encodes the curvature of the parameter space at the current parameter values, reflecting how sensitive the likelihood function is to changes in these parameters.
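For the Gaussian case, the negative log-likelihood and Fisher information discussed in this hunk have closed forms. A LaTeX sketch of the identities, using our derivation for theta = (mu, log sigma) rather than text from the patch:

    % NLL and Fisher information for y ~ N(mu, sigma^2), theta = (mu, log sigma).
    \begin{align}
      -\log p(y \mid \theta) &= \log\sigma + \frac{(y - \mu)^2}{2\sigma^2} + \frac{1}{2}\log 2\pi, \\
      \mathcal{I}(\theta) &= \begin{pmatrix} 1/\sigma^2 & 0 \\ 0 & 2 \end{pmatrix},
      \qquad
      \tilde{\nabla} = \mathcal{I}(\theta)^{-1} \, \nabla_\theta \bigl( -\log p(y \mid \theta) \bigr).
    \end{align}

The 1/sigma^2 entry grows as the predicted distribution narrows, which is exactly the sensitivity described above: small changes in mu move the likelihood sharply when sigma is small.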
From 4e269d2e18358a6fc2baaa80a93ed0cfe0f8aa67 Mon Sep 17 00:00:00 2001
From: Ivikhostrup <56341364+Ivikhostrup@users.noreply.github.com>
Date: Tue, 28 May 2024 19:58:41 +0200
Subject: [PATCH 4/4] Update report_thesis/src/sections/background.tex

Co-authored-by: Pattrigue <57709490+Pattrigue@users.noreply.github.com>
---
 report_thesis/src/sections/background.tex | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/report_thesis/src/sections/background.tex b/report_thesis/src/sections/background.tex
index 53e32eaa..253d8d35 100644
--- a/report_thesis/src/sections/background.tex
+++ b/report_thesis/src/sections/background.tex
@@ -363,7 +363,7 @@ \subsubsection{Natural Gradient Boosting (NGBoost)}
 \gls{ngboost} is a variant of the gradient boosting algorithm that leverages the concept of natural gradients with the goal of improving convergence speed and model performance.
 In more complex models, the parameter space can be curved and thus non-Euclidean, making standard gradient descent less effective.
 Consequently, using standard gradient descent can lead to slow convergence and suboptimal performance.
-This is where natural gradients come in.
+In such scenarios, the application of natural gradients becomes particularly advantageous.
 
 Natural gradients account for the underlying geometry of the parameter space by using information about its curvature.
 By incorporating this information, natural gradients can navigate the parameter space more efficiently, leading to faster convergence and better performance.
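To substantiate the efficiency claim in this final hunk, here is a tiny numeric comparison of the standard and natural gradient for a single observation, using the same Gaussian parameterization as the sketches above (our example, not from the patch):

    import numpy as np

    y, mu, sigma = 2.0, 0.0, 5.0               # deliberately mis-scaled start
    sigma2 = sigma ** 2

    grad = np.array([(mu - y) / sigma2,                # d NLL / d mu
                     1.0 - (y - mu) ** 2 / sigma2])    # d NLL / d log sigma
    fisher = np.diag([1.0 / sigma2, 2.0])
    nat_grad = np.linalg.solve(fisher, grad)

    print(grad)      # [-0.08  0.84]: the mu-step nearly vanishes for large sigma
    print(nat_grad)  # [-2.    0.42]: curvature-corrected step moves mu toward y

A plain gradient step barely moves mu because the flat likelihood of a wide Gaussian hides how far off the mean is; rescaling by the inverse Fisher information restores a step proportional to mu - y, which is the behavior the paragraph attributes to natural gradients.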