Use a different MAE data for more meaningful MOFA results #321

artur-sannikov · 2023-08-04T09:35:19Z

Is your feature request related to a problem? Please describe.
At the moment, the Hintikka data is used for basically all analyses in the OMA book. However, I see a problem in the MOFA section: the model finds only one factor, and that factor explains variability only in the metabolomic data (see the "Variance Explained per factor and assay" figures). This makes the results difficult to interpret and discuss because, in my opinion, they do not show much.

In contrast, the original MOFA+ paper found that the factors capture different pieces of information, for example differences in methylation, classes of neurons, etc. The presence of these factors also allowed the authors to apply t-SNE to discover sub-populations of cell types. In our case, we cannot do much downstream analysis.
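For readers unfamiliar with the "variance explained per factor and assay" quantity mentioned above, here is a minimal sketch of how such a value can be computed for a single factor. It assumes centered data and the usual R² definition (one minus residual over total sum of squares). The function name, the toy data, and the view names are hypothetical illustrations, and this is not the MOFA2 implementation.

```python
def variance_explained(Y, z, w):
    """R^2 of the rank-one reconstruction z * w^T for one centered view.

    Y: samples x features (list of lists), assumed column-centered
    z: factor values, one per sample
    w: factor weights (loadings), one per feature
    """
    ss_total = sum(y * y for row in Y for y in row)
    ss_resid = sum(
        (Y[i][j] - z[i] * w[j]) ** 2
        for i in range(len(Y))
        for j in range(len(Y[0]))
    )
    return 1.0 - ss_resid / ss_total


# Hypothetical toy data: one factor shared across four samples.
z = [1.0, -1.0, 1.0, -1.0]

views = {
    # This view is exactly z * w^T, so the factor explains all of it.
    "metabolome": ([[2.0, 1.0], [-2.0, -1.0], [2.0, 1.0], [-2.0, -1.0]], [2.0, 1.0]),
    # This view varies independently of z, so its weights are zero.
    "microbiome": ([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]], [0.0, 0.0]),
}

r2 = {name: variance_explained(Y, z, w) for name, (Y, w) in views.items()}
print(r2)
```

In this toy case the factor explains all of the variance in the metabolome view and none in the microbiome view, which is the same pattern as the situation described above.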

Describe the solution you'd like
I see two solutions here:

Use some other multi-omic data from a different resource, build an MAE object (or find existing data already in MAE format), and show how we can perform downstream analysis on MOFA factors;
Add an MAE object directly to mia (and use that for MOFA and downstream analysis), which might be more complicated but at the same time should make future work easier.

These two solutions can be implemented simultaneously, and I have no preference for either as long as the data gives us meaningful and interpretable results.

Additional context

The MAE (MultiAssayExperiment) package includes a built-in example object, miniACC;
If we use some other data, it will break the flow of the analysis, which at the moment uses CCA to uncover some interesting relationships and then MOFA to confirm and expand on those findings;
The dataset should, of course, be related to the microbiome, although most available multi-omic datasets come from cancer research (e.g., RNA-seq, methylation, and mutation data).

antagomir · 2023-08-04T10:18:26Z

We can certainly add another MAE demo data set in mia, for instance. It should be about microbiome research (which is indeed so far less covered in terms of multiomics methodology than cancer studies).

Or we can use an existing data set. Possible sources:

borenstein-lab/microbiome-metabolome-curated-data/; does not support TreeSE/MAE as such, so it would require additional work/code.
curatedMetagenomicData; provides a list of TreeSEs for cases where multiomics is available, but MAE support is still under consideration, so our own code would have to convert the experiment list into an MAE.
EBI MGnify API through the MGnifyR package; I think this already provides outputs in MAE format. This is a central data resource for European microbiome research and open data sharing, so I think it would be quite a good source if a suitable data set can be identified.

I expect that more informative factors can be identified from data sets with larger sample sizes.

artur-sannikov changed the title ~~Use a different MAE data for more meaninful MOFA results~~ Use a different MAE data for more meaningful MOFA results Aug 4, 2023

TuomasBorman added this to miaverse finalization Oct 5, 2023

antagomir mentioned this issue Apr 22, 2024

MAE support for MOFA #277

Open
