Merge pull request #83 from chhoumann/thesis-introduction
[KB-105] Thesis P10 introduction
Ivikhostrup authored Feb 20, 2024
2 parents 5b60a44 + 6421674 commit 255cada
Showing 5 changed files with 155 additions and 7 deletions.
3 changes: 3 additions & 0 deletions report_thesis/src/_preamble.tex
@@ -11,6 +11,9 @@
\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{afterpage}
\usepackage[acronym]{glossaries}

\input{glossary.tex}

% ACM cleanup
\settopmatter{printacmref=false} % Removes citation information below abstract
11 changes: 11 additions & 0 deletions report_thesis/src/glossary.tex
@@ -0,0 +1,11 @@
\newacronym{msl}{MSL}{Mars Science Laboratory}
\newacronym{libs}{LIBS}{Laser-Induced Breakdown Spectroscopy}
\newacronym{chemcam}{ChemCam}{Chemistry and Camera}
\newacronym{mer}{MER}{Mars Exploration Rover}
\newacronym{pls}{PLS}{Partial Least Squares}
\newacronym{ica}{ICA}{Independent Component Analysis}
\newacronym{moc}{MOC}{Multivariate Oxide Composition}
\newacronym{ann}{ANN}{Artificial Neural Network}
\newacronym{gbr}{GBR}{Gradient Boosting Regression}
\newacronym{rf}{RF}{Random Forest}
\newacronym{lasso}{LASSO}{Least Absolute Shrinkage and Selection Operator}
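% Usage sketch (assumes the default acronym style of the glossaries package;
% the entry label "libs" is the one defined above):
%   \gls{libs}       first use prints "Laser-Induced Breakdown Spectroscopy (LIBS)",
%                    later uses print "LIBS"
%   \acrshort{libs}  always prints "LIBS"
%   \acrlong{libs}   always prints "Laser-Induced Breakdown Spectroscopy"
%   \acrfull{libs}   always prints the long form followed by "(LIBS)"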
9 changes: 2 additions & 7 deletions report_thesis/src/index.tex
@@ -8,14 +8,9 @@

\maketitle

\subsubsection*{Acknowledgements:}
\subsubsection*{Acknowledgements:}


\section{Introduction}
Brief introduction to the field of study.
Importance of establishing benchmarks.
Objectives of the first part of the project.
The whole story – longer version but still short, also motivates
\input{sections/introduction.tex}

\section{Background}
Background / Preliminaries (what you need to know in order to understand the story)
95 changes: 95 additions & 0 deletions report_thesis/src/references.bib
@@ -0,0 +1,95 @@
@article{p9_paper,
title = {Identifying Limitations in the {ChemCam} Multivariate Oxide Composition Model for Elemental Quantification in Martian Geological Samples},
author = {Houmann, Christian Bager Bach and Østergaard, Patrick Frostholm and Hostrup, Ivik Lau Dalgas},
date = {2024-01-31},
url = {https://vbn-aau-dk.zorac.aub.aau.dk/ws/files/659161040/p9_report_31_01_24.pdf},
langid = {english},
}

@article{andersonImprovedAccuracyQuantitative2017,
title = {Improved Accuracy in Quantitative Laser-Induced Breakdown Spectroscopy Using Sub-Models},
author = {Anderson, Ryan B. and Clegg, Samuel M. and Frydenvang, Jens and Wiens, Roger C. and McLennan, Scott and Morris, Richard V. and Ehlmann, Bethany and Dyar, M. Darby},
date = {2017-03-01},
journaltitle = {Spectrochimica Acta Part B: Atomic Spectroscopy},
shortjournal = {Spectrochimica Acta Part B: Atomic Spectroscopy},
volume = {129},
pages = {49--57},
issn = {0584-8547},
doi = {10.1016/j.sab.2016.12.002},
url = {https://www.sciencedirect.com/science/article/pii/S0584854716303925},
urldate = {2023-02-11},
abstract = {Accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the laser-induced breakdown spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibrations methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element's emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. The sub-model method, using partial least squares (PLS) regression, is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.},
langid = {english}
}

@article{cleggRecalibrationMarsScience2017,
title = {Recalibration of the {{Mars Science Laboratory ChemCam}} Instrument with an Expanded Geochemical Database},
author = {Clegg, Samuel M. and Wiens, Roger C. and Anderson, Ryan and Forni, Olivier and Frydenvang, Jens and Lasue, Jeremie and Cousin, Agnes and Payré, Valérie and Boucher, Tommy and Dyar, M. Darby and McLennan, Scott M. and Morris, Richard V. and Graff, Trevor G. and Mertzman, Stanley A. and Ehlmann, Bethany L. and Belgacem, Ines and Newsom, Horton and Clark, Ben C. and Melikechi, Noureddine and Mezzacappa, Alissa and McInroy, Rhonda E. and Martinez, Ronald and Gasda, Patrick and Gasnault, Olivier and Maurice, Sylvestre},
date = {2017-03-01},
journaltitle = {Spectrochimica Acta Part B: Atomic Spectroscopy},
shortjournal = {Spectrochimica Acta Part B: Atomic Spectroscopy},
volume = {129},
pages = {64--85},
issn = {0584-8547},
doi = {10.1016/j.sab.2016.12.003},
url = {https://www.sciencedirect.com/science/article/pii/S0584854716303913},
urldate = {2023-02-11},
abstract = {The ChemCam Laser-Induced Breakdown Spectroscopy (LIBS) instrument onboard the Mars Science Laboratory (MSL) rover Curiosity has obtained {$>$}300,000 spectra of rock and soil analysis targets since landing at Gale Crater in 2012, and the spectra represent perhaps the largest publicly-available LIBS datasets. The compositions of the major elements, reported as oxides (SiO2, TiO2, Al2O3, FeOT, MgO, CaO, Na2O, K2O), have been re-calibrated using a laboratory LIBS instrument, Mars-like atmospheric conditions, and a much larger set of standards (408) that span a wider compositional range than previously employed. The new calibration uses a combination of partial least squares (PLS1) and Independent Component Analysis (ICA) algorithms, together with a calibration transfer matrix to minimize differences between the conditions under which the standards were analyzed in the laboratory and the conditions on Mars. While the previous model provided good results in the compositional range near the average Mars surface composition, the new model fits the extreme compositions far better. Examples are given for plagioclase feldspars, where silicon was significantly over-estimated by the previous model, and for calcium-sulfate veins, where silicon compositions near zero were inaccurate. The uncertainties of major element abundances are described as a function of the abundances, and are overall significantly lower than the previous model, enabling important new geochemical interpretations of the data.},
langid = {english}
}

@article{andersonPostlandingMajorElement2022,
title = {Post-Landing Major Element Quantification Using {{SuperCam}} Laser Induced Breakdown Spectroscopy},
author = {Anderson, Ryan B. and Forni, Olivier and Cousin, Agnes and Wiens, Roger C. and Clegg, Samuel M. and Frydenvang, Jens and Gabriel, Travis S. J. and Ollila, Ann and Schröder, Susanne and Beyssac, Olivier and Gibbons, Erin and Vogt, David S. and Clavé, Elise and Manrique, Jose-Antonio and Legett, Carey and Pilleri, Paolo and Newell, Raymond T. and Sarrao, Joseph and Maurice, Sylvestre and Arana, Gorka and Benzerara, Karim and Bernardi, Pernelle and Bernard, Sylvain and Bousquet, Bruno and Brown, Adrian J. and Alvarez-Llamas, César and Chide, Baptiste and Cloutis, Edward and Comellas, Jade and Connell, Stephanie and Dehouck, Erwin and Delapp, Dorothea M. and Essunfeld, Ari and Fabre, Cecile and Fouchet, Thierry and Garcia-Florentino, Cristina and García-Gómez, Laura and Gasda, Patrick and Gasnault, Olivier and Hausrath, Elisabeth M. and Lanza, Nina L. and Laserna, Javier and Lasue, Jeremie and Lopez, Guillermo and Madariaga, Juan Manuel and Mandon, Lucia and Mangold, Nicolas and Meslin, Pierre-Yves and Nelson, Anthony E. and Newsom, Horton and Reyes-Newell, Adriana L. and Robinson, Scott and Rull, Fernando and Sharma, Shiv and Simon, Justin I. and Sobron, Pablo and Fernandez, Imanol Torre and Udry, Arya and Venhaus, Dawn and McLennan, Scott M. and Morris, Richard V. and Ehlmann, Bethany},
date = {2022-02-01},
journaltitle = {Spectrochimica Acta Part B: Atomic Spectroscopy},
shortjournal = {Spectrochimica Acta Part B: Atomic Spectroscopy},
volume = {188},
pages = {106347},
issn = {0584-8547},
doi = {10.1016/j.sab.2021.106347},
url = {https://www.sciencedirect.com/science/article/pii/S0584854721003049},
urldate = {2023-05-15},
abstract = {The SuperCam instrument on the Perseverance Mars 2020 rover uses a pulsed 1064~nm laser to ablate targets at a distance and conduct laser induced breakdown spectroscopy (LIBS) by analyzing the light from the resulting plasma. SuperCam LIBS spectra are preprocessed to remove ambient light, noise, and the continuum signal present in LIBS observations. Prior to quantification, spectra are masked to remove noisier spectrometer regions and spectra are normalized to minimize signal fluctuations and effects of target distance. In some cases, the spectra are also standardized or binned prior to quantification. To determine quantitative elemental compositions of diverse geologic materials at Jezero crater, Mars, we use a suite of 1198 laboratory spectra of 334 well-characterized reference samples. The samples were selected to span a wide range of compositions and include typical silicate rocks, pure minerals (e.g., silicates, sulfates, carbonates, oxides), more unusual compositions (e.g., Mn ore and sodalite), and replicates of the sintered SuperCam calibration targets (SCCTs) onboard the rover. For each major element (SiO2, TiO2, Al2O3, FeOT, MgO, CaO, Na2O, K2O), the database was subdivided into five “folds” with similar distributions of the element of interest. One fold was held out as an independent test set, and the remaining four folds were used to optimize multivariate regression models relating the spectrum to the composition. We considered a variety of models, and selected several for further investigation for each element, based primarily on the root mean squared error of prediction (RMSEP) on the test set, when analyzed at 3~m. In cases with several models of comparable performance at 3~m, we incorporated the SCCT performance at different distances to choose the preferred model. Shortly after landing on Mars and collecting initial spectra of geologic targets, we selected one model per element. 
Subsequently, with additional data from geologic targets, some models were revised to ensure results that are more consistent with geochemical constraints. The calibration discussed here is a snapshot of an ongoing effort to deliver the most accurate chemical compositions with SuperCam LIBS.},
langid = {english}
}

@online{marsnasagov_vikings,
title = {Viking 1 \& 2 {\textbar} Missions},
url = {https://mars.nasa.gov/mars-exploration/missions/viking-1-2},
abstract = {{NASA}'s real-time portal for Mars exploration, featuring the latest news, images, and discoveries from the Red Planet.},
titleaddon = {{NASA} Mars Exploration},
author = {mars.nasa.gov},
urldate = {2024-01-23},
langid = {english}
}

@online{marsnasagov_observer,
title = {Mars Observer {\textbar} Missions},
url = {https://mars.nasa.gov/mars-exploration/missions/mars-observer},
abstract = {{NASA}'s real-time portal for Mars exploration, featuring the latest news, images, and discoveries from the Red Planet.},
titleaddon = {{NASA} Mars Exploration},
author = {mars.nasa.gov},
urldate = {2024-01-23},
langid = {english}
}

@misc{marsnasagov_chemcam,
title = {ChemCam},
url = {https://mars.nasa.gov/msl/spacecraft/instruments/chemcam/},
journal = {NASA},
publisher = {NASA},
author = {Lanza, Nina},
year = {2022},
month = {May}
}

@online{marsnasagov_spirit_opportunity,
title = {Mars Exploration Rovers {\textbar} Missions},
url = {https://mars.nasa.gov/mars-exploration/missions/mars-exploration-rovers},
abstract = {{NASA}'s real-time portal for Mars exploration, featuring the latest news, images, and discoveries from the Red Planet.},
titleaddon = {{NASA} Mars Exploration},
author = {mars.nasa.gov},
urldate = {2024-01-23},
langid = {english}
}
44 changes: 44 additions & 0 deletions report_thesis/src/sections/introduction.tex
@@ -0,0 +1,44 @@
\section{Introduction}\label{sec:introduction}
NASA's Viking missions in the 1970s were the first U.S. missions to land spacecraft safely on Mars, with the aim of determining whether life existed on the planet.
One experiment suggested the presence of life, but the results were ambiguous, and NASA was unable to repeat the experiment due to budget constraints~\cite{marsnasagov_vikings}.

A few decades later, the philosophy of Martian exploration had shifted from searching for life to investigating whether Mars ever had the conditions to support life as we know it.
The \gls{mer} mission, which included the Spirit and Opportunity rovers, discovered clear evidence that water once flowed on Mars.
Because water alone is not enough to support life, NASA broadened its focus to include the search for organic material~\cite{marsnasagov_observer, marsnasagov_spirit_opportunity}.

The Curiosity rover landed inside Gale Crater in August 2012 as part of the \gls{msl} mission with this very purpose.
Its instruments quickly found chemical and mineral evidence that Mars once had the conditions to support life as we know it~\cite{marsnasagov_chemcam}.

One of the instruments aboard the rover is the \gls{chemcam} instrument, a remote-sensing laser instrument used to gather \gls{libs} data from geological samples on Mars.
\gls{libs} is a non-invasive technique that enables rapid analysis without sample preparation.
A laser ablates surface contaminants to expose the underlying material and generates a plasma plume from the exposed sample.
This plasma plume emits light, which three distinct spectrometers capture as a series of spectral readings.
These spectra consist of emission lines that can be attributed to specific elements, and the intensity of each line reflects the concentration of the corresponding element in the sample.
Consequently, a spectrum serves as a complex, multi-dimensional fingerprint of the elemental composition of the examined geological formation~\cite{cleggRecalibrationMarsScience2017}.

Analyzing \gls{libs} data is computationally challenging because of high multicollinearity in the spectral data, which diminishes the effectiveness of traditional linear analysis.
This multicollinearity stems from correlations among spectral channels and elemental emission characteristics, and it complicates data interpretation.
Additionally, \textit{matrix effects} in \gls{libs} spectra arise when various physical interactions cause emission line intensities to change without a corresponding change in the element's actual concentration.
This phenomenon introduces variability that complicates the interpretation of spectral data and challenges the accuracy of computational models that predict elemental composition~\cite{andersonImprovedAccuracyQuantitative2017}.
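
As a brief illustration of why multicollinearity hampers linear analysis, consider a textbook ordinary least squares sketch; the notation is introduced here purely for illustration and is not taken from the cited works.
Let $X$ be an $n \times p$ matrix holding $n$ spectra over $p$ spectral channels, and let $y$ contain the corresponding concentrations of a single oxide.
Assuming $X^{\top} X$ is invertible and the errors are uncorrelated with common variance $\sigma^2$, the least squares estimate and its covariance are
\begin{equation}
    \hat{\beta} = (X^{\top} X)^{-1} X^{\top} y,
    \qquad
    \mathrm{Var}(\hat{\beta}) = \sigma^2 (X^{\top} X)^{-1}.
\end{equation}
When spectral channels are strongly correlated, $X^{\top} X$ is close to singular, so its inverse has very large eigenvalues and the variance of the coefficient estimates is inflated accordingly.
When $p > n$, $X^{\top} X$ is singular and $\hat{\beta}$ is not uniquely defined.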

To analyze Martian geological samples, the \gls{chemcam} team currently uses the \gls{moc} model~\cite{cleggRecalibrationMarsScience2017}.
This model integrates \gls{pls} and \gls{ica} to predict the composition of major oxides.
Although the \gls{moc} model has proven useful, it suffers from limitations in predictive accuracy and robustness.
In \citet{p9_paper}, we created a replica of the \gls{moc} model and identified which components were responsible for these limitations.
Through a series of comparative experiments, we showed that model selection was the primary cause and that both \gls{ann} and \gls{gbr} methods could improve the model's predictive accuracy and robustness.

This is further underscored by work from the SuperCam team.
In 2021, the Perseverance rover landed on Mars, equipped with the SuperCam instrument, which is the successor to the \gls{chemcam} instrument.
As part of the ongoing work to support the SuperCam instrument, \citet{andersonPostlandingMajorElement2022} experimented with various machine learning models to predict the composition of major oxides in geological samples using the SuperCam \gls{libs} calibration dataset.
While the team decided to retain \gls{pls} for analyzing certain oxides, \gls{ica} was entirely discontinued.
Instead, models based on \gls{gbr}, \gls{rf}, and \gls{lasso} were selected for other oxides.
This decision reinforces our finding that \gls{ica} regression models fall short in accurately predicting the composition of major oxides in geological samples.
Consistent with our observations, \gls{gbr} was also identified as a high-performing model in their analyses.

However, there remains considerable uncertainty about which machine learning techniques best predict the composition of major oxides in Martian geological samples using \gls{libs} data.
This underscores the importance of a detailed study into advanced machine learning models for improving predictions in these applications.

\textit{In this work, we aim to investigate the application of advanced machine learning models to predict the composition of major oxides in Martian geological samples using \gls{libs} data.}

The remainder of this paper is organized as follows:
\textit{Structure of the paper will be added here after the paper is written.}
% TODO: Describe the structure of the paper here.
