Finished overview chapter
oliviaAB committed Mar 14, 2024
1 parent 626ebfa commit 8e38f06
Showing 4 changed files with 36 additions and 31 deletions.
21 changes: 11 additions & 10 deletions docs/overview.html
@@ -309,8 +309,11 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non


</header><div class="cell">
<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://docs.ropensci.org/targets/">targets</a></span><span class="op">)</span></span>
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/PlantandFoodResearch/moiraine">moiraine</a></span><span class="op">)</span></span>

</div>
<p>In this chapter, we provide an overview of the capabilities of <code>moiraine</code>. Some code is presented for illustration, but for more details readers will be directed towards the corresponding chapter in the manual.</p>
<div class="cell">
<details><summary>Loading packages</summary><div class="sourceCode" id="cb1"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://github.com/PlantandFoodResearch/moiraine">moiraine</a></span><span class="op">)</span></span>
<span></span>
<span><span class="co">## For custom colour palettes</span></span>
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va"><a href="https://ggplot2.tidyverse.org">ggplot2</a></span><span class="op">)</span></span>
@@ -321,9 +324,7 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non
<span></span>
<span><span class="co">## For visualising sO2PLS summary</span></span>
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span><span class="op">(</span><span class="va">OmicsPLS</span><span class="op">)</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div class="cell">

</details>
</div>
<section id="input-data" class="level2" data-number="1.1"><h2 data-number="1.1" class="anchored" data-anchor-id="input-data">
<span class="header-section-number">1.1</span> Input data</h2>
@@ -395,7 +396,7 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non
<p>Similarly, a number of functions allow to quickly summarise different aspects of the multi-omics dataset, such as creating an upset plot to compare the samples present in each omics dataset (<code><a href="https://rdrr.io/pkg/moiraine/man/plot_samples_upset.html">plot_samples_upset()</a></code>), or generating a density plot for each omics dataset (<code><a href="https://rdrr.io/pkg/moiraine/man/plot_density_data.html">plot_density_data()</a></code>). See <a href="inspecting_multidataset.html"><span>Chapter&nbsp;4</span></a> for more details about the different visualisations and summary functions implemented.</p>
</section><section id="data-pre-processing" class="level2" data-number="1.2"><h2 data-number="1.2" class="anchored" data-anchor-id="data-pre-processing">
<span class="header-section-number">1.2</span> Data pre-processing</h2>
<p>Target factories have been implemented to facilitate the application of similar tasks across the different omics datasets. For example, the <code><a href="https://rdrr.io/pkg/moiraine/man/transformation_datasets_factory.html">transformation_datasets_factory()</a></code> function generates a sequence of targets to apply one of many possible transformations (from the <code>vsn</code>, <code>DESeq2</code>, or <code>bestNormalize</code> packages, for example) on each omics dataset, store information about each transformation performed, and generate a new <code>MultiDataSet</code> object in which the omics measurements have been transformed:</p>
<p>Target factories have been implemented to facilitate the application of similar tasks across the different omics datasets. For example, the <code><a href="https://rdrr.io/pkg/moiraine/man/transformation_datasets_factory.html">transformation_datasets_factory()</a></code> function generates a sequence of targets to apply one of many possible transformations (from the <a href="https://bioconductor.org/packages/release/bioc/html/vsn.html"><code>vsn</code></a>, <a href="https://bioconductor.org/packages/release/bioc/html/DESeq2.html"><code>DESeq2</code></a>, or <a href="https://cran.r-project.org/web/packages/bestNormalize/index.html"><code>bestNormalize</code></a> packages, for example) on each omics dataset, store information about each transformation performed, and generate a new <code>MultiDataSet</code> object in which the omics measurements have been transformed:</p>
<details><summary>
Code
</summary><div class="targets-chunk">
@@ -412,7 +413,7 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non
</div>
</details><p><img src="images/dag_transformation.png" class="img-fluid"></p>
<p>Note that there is also the option for users to apply their own custom transformations to the datasets (see <a href="modifying_multidataset.html"><span>Chapter&nbsp;5</span></a>).</p>
<p>Similarly, the <code>pca_complete_data_factory</code> generates a list of targets to run a PCA on each omics dataset via the <code>pcaMethods</code> package, and if necessary imputes missing values through NIPALS-PCA. The PCA results can be easily visualised for all or specific omics datasets:</p>
<p>Similarly, the <code>pca_complete_data_factory</code> generates a list of targets to run a PCA on each omics dataset via the <a href="https://bioconductor.org/packages/release/bioc/html/pcaMethods.html"><code>pcaMethods</code> package</a>, and if necessary imputes missing values through NIPALS-PCA. The PCA results can be easily visualised for all or specific omics datasets:</p>
<div class="cell">
<details><summary>Code</summary><div class="sourceCode" id="cb5"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/pkg/moiraine/man/plot_screeplot_pca.html">plot_screeplot_pca</a></span><span class="op">(</span><span class="va">pca_runs_list</span><span class="op">)</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details><div class="cell-output-display">
@@ -438,7 +439,7 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non
<p>More information about data pre-processing can be found in <a href="preprocessing.html"><span>Chapter&nbsp;6</span></a>.</p>
</section><section id="data-pre-filtering" class="level2" data-number="1.3"><h2 data-number="1.3" class="anchored" data-anchor-id="data-pre-filtering">
<span class="header-section-number">1.3</span> Data pre-filtering</h2>
<p>The created <code>MultiDataSet</code> object can be filtered, both in terms of samples and features, by passing a list of sample or feature IDs to retain, or by using logical tests on samples or features metadata. In addition, we implement target factories to retain only the most variable features in each omics dataset –unsupervised filtering–, or to retain the features most associated with an outcome of interest, via sPLS-DA from <code>mixOmics</code> –supervised filtering– (see <a href="prefiltering.html"><span>Chapter&nbsp;7</span></a>). This pre-filtering step is essential to reduce the size of the datasets prior to multi-omics integration.</p>
<p>The created <code>MultiDataSet</code> object can be filtered, both in terms of samples and features, by passing a list of sample or feature IDs to retain, or by using logical tests on samples or features metadata. In addition, we implement target factories to retain only the most variable features in each omics dataset –unsupervised filtering–, or to retain the features most associated with an outcome of interest, via sPLS-DA from <a href="http://mixomics.org/"><code>mixOmics</code></a> –supervised filtering– (see <a href="prefiltering.html"><span>Chapter&nbsp;7</span></a>). This pre-filtering step is essential to reduce the size of the datasets prior to multi-omics integration.</p>
<details><summary>
Code
</summary><div class="targets-chunk">
@@ -475,7 +476,7 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non
</div>
</section><section id="multi-omics-data-integration" class="level2" data-number="1.4"><h2 data-number="1.4" class="anchored" data-anchor-id="multi-omics-data-integration">
<span class="header-section-number">1.4</span> Multi-omics data integration</h2>
<p>Currently, <code>moiraine</code> provides functions and target factories to facilitate the use of five integration methods: sPLS and DIABLO from the <code>mixOmics</code> package, sO2PLS from <code>OmicsPLS</code>, as well as <code>MOFA</code> and <code>MEFISTO</code> from <code>MOFA2</code>.</p>
<p>Currently, <code>moiraine</code> provides functions and target factories to facilitate the use of five integration methods: sPLS and DIABLO from the <code>mixOmics</code> package, sO2PLS from <a href="https://cran.r-project.org/web/packages/OmicsPLS/index.html"><code>OmicsPLS</code></a>, as well as <code>MOFA</code> and <code>MEFISTO</code> from <a href="https://biofam.github.io/MOFA2/"><code>MOFA2</code></a>.</p>
<p>This includes functions that transform a <code>MultiDataSet</code> object into the required input format for each integration method; for example for sPLS (only top of the matrices shown):</p>
<div class="cell">
<details><summary>Code</summary><div class="sourceCode" id="cb9"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/pkg/moiraine/man/get_input_spls.html">get_input_spls</a></span><span class="op">(</span></span>
@@ -514,7 +515,7 @@ <h1 class="title"><span id="sec-overview" class="quarto-section-identifier d-non
</summary><div class="targets-chunk">
<div class="cell">
<div class="sourceCode" id="cb11"><pre class="downlit sourceCode r code-with-copy"><code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/base/list.html">list</a></span><span class="op">(</span></span>
<span> <span class="fu"><a href="https://docs.ropensci.org/targets/reference/tar_target.html">tar_target</a></span><span class="op">(</span></span>
<span> <span class="fu">tar_target</span><span class="op">(</span></span>
<span> <span class="va">diablo_input</span>, <span class="co"># DIABLO input object</span></span>
<span> <span class="fu"><a href="https://rdrr.io/pkg/moiraine/man/get_input_mixomics_supervised.html">get_input_mixomics_supervised</a></span><span class="op">(</span></span>
<span> <span class="va">mo_presel_supervised</span>, <span class="co"># MultiDataSet object (prefiltered)</span></span>
Binary file modified docs/overview_files/figure-html/show-features-weight-comp-1.png
2 changes: 1 addition & 1 deletion docs/sitemap.xml
@@ -6,7 +6,7 @@
</url>
<url>
<loc>https://solid-lamp-kq546rq.pages.github.io/overview.html</loc>
<lastmod>2024-03-13T23:19:52.823Z</lastmod>
<lastmod>2024-03-14T20:08:18.886Z</lastmod>
</url>
<url>
<loc>https://solid-lamp-kq546rq.pages.github.io/example_dataset.html</loc>
44 changes: 24 additions & 20 deletions overview.qmd
@@ -15,10 +15,29 @@ library(purrr)
library(OmicsPLS)
```

```{r loading-data}
#| echo: false
mo_set <- tar_read(mo_set_de)
tar_load(interesting_features)
tar_load(pca_runs_list)
tar_load(mo_presel_supervised)
tar_load(so2pls_final_run)
tar_load(diablo_final_run)
mofa_output <- tar_read(mofa_output) |>
moiraine:::.filter_output_dimensions(paste("Factor", 1:4))
tar_load(diablo_output)
tar_load(output_list)
```

In this chapter, we provide an overview of the capabilities of `moiraine`. Some code is presented for illustration, but for more details readers will be directed towards the corresponding chapter in the manual.


```{r setup-visible}
#| code-fold: true
#| code-summary: "Loading packages"
#| eval: false
library(targets)
library(moiraine)
## For custom colour palettes
@@ -32,21 +51,6 @@ library(purrr)
library(OmicsPLS)
```

```{r loading-data}
#| echo: false
mo_set <- tar_read(mo_set_de)
tar_load(interesting_features)
tar_load(pca_runs_list)
tar_load(mo_presel_supervised)
tar_load(so2pls_final_run)
tar_load(diablo_final_run)
mofa_output <- tar_read(mofa_output) |>
moiraine:::.filter_output_dimensions(paste("Factor", 1:4))
tar_load(diablo_output)
tar_load(output_list)
```


## Input data

@@ -109,7 +113,7 @@ Similarly, a number of functions allow to quickly summarise different aspects of

## Data pre-processing

Target factories have been implemented to facilitate the application of similar tasks across the different omics datasets. For example, the `transformation_datasets_factory()` function generates a sequence of targets to apply one of many possible transformations (from the `vsn`, `DESeq2`, or `bestNormalize` packages, for example) on each omics dataset, store information about each transformation performed, and generate a new `MultiDataSet` object in which the omics measurements have been transformed:
Target factories have been implemented to facilitate the application of similar tasks across the different omics datasets. For example, the `transformation_datasets_factory()` function generates a sequence of targets to apply one of many possible transformations (from the [`vsn`](https://bioconductor.org/packages/release/bioc/html/vsn.html), [`DESeq2`](https://bioconductor.org/packages/release/bioc/html/DESeq2.html), or [`bestNormalize`](https://cran.r-project.org/web/packages/bestNormalize/index.html) packages, for example) on each omics dataset, store information about each transformation performed, and generate a new `MultiDataSet` object in which the omics measurements have been transformed:

<details>

@@ -136,7 +140,7 @@ transformation_datasets_factory(

Note that there is also the option for users to apply their own custom transformations to the datasets (see @sec-modifying-multidataset).

Similarly, the `pca_complete_data_factory` generates a list of targets to run a PCA on each omics dataset via the `pcaMethods` package, and if necessary imputes missing values through NIPALS-PCA. The PCA results can be easily visualised for all or specific omics datasets:
Similarly, the `pca_complete_data_factory` generates a list of targets to run a PCA on each omics dataset via the [`pcaMethods` package](https://bioconductor.org/packages/release/bioc/html/pcaMethods.html), and if necessary imputes missing values through NIPALS-PCA. The PCA results can be easily visualised for all or specific omics datasets:

```{r pca-screeplot}
#| code-fold: true
@@ -169,7 +173,7 @@ More information about data pre-processing can be found in @sec-preprocessing.

## Data pre-filtering

The created `MultiDataSet` object can be filtered, both in terms of samples and features, by passing a list of sample or feature IDs to retain, or by using logical tests on samples or features metadata. In addition, we implement target factories to retain only the most variable features in each omics dataset --unsupervised filtering--, or to retain the features most associated with an outcome of interest, via sPLS-DA from `mixOmics` --supervised filtering-- (see @sec-prefiltering). This pre-filtering step is essential to reduce the size of the datasets prior to multi-omics integration.
The created `MultiDataSet` object can be filtered, both in terms of samples and features, by passing a list of sample or feature IDs to retain, or by using logical tests on samples or features metadata. In addition, we implement target factories to retain only the most variable features in each omics dataset --unsupervised filtering--, or to retain the features most associated with an outcome of interest, via sPLS-DA from [`mixOmics`](http://mixomics.org/) --supervised filtering-- (see @sec-prefiltering). This pre-filtering step is essential to reduce the size of the datasets prior to multi-omics integration.

<details>

@@ -201,7 +205,7 @@ mo_presel_supervised

## Multi-omics data integration

Currently, `moiraine` provides functions and target factories to facilitate the use of five integration methods: sPLS and DIABLO from the `mixOmics` package, sO2PLS from `OmicsPLS`, as well as `MOFA` and `MEFISTO` from `MOFA2`.
Currently, `moiraine` provides functions and target factories to facilitate the use of five integration methods: sPLS and DIABLO from the `mixOmics` package, sO2PLS from [`OmicsPLS`](https://cran.r-project.org/web/packages/OmicsPLS/index.html), as well as `MOFA` and `MEFISTO` from [`MOFA2`](https://biofam.github.io/MOFA2/).

This includes functions that transform a `MultiDataSet` object into the required input format for each integration method; for example for sPLS (only top of the matrices shown):

