Is your feature request related to a problem? Please describe.
At the moment, the Hintikka data is used for basically all analyses in the OMA book. However, I see a problem in the MOFA section: the model finds only one factor, and that factor explains variability only in the metabolomic data (see the "Variance Explained per factor and assay" figures). This makes the results difficult to interpret and discuss because, in my opinion, they do not show much.
In contrast, the authors of the original MOFA+ paper found that the factors capture different pieces of information, for example differences in methylation, classes of neurons, etc. The presence of these factors also allowed them to apply t-SNE to discover sub-populations of cell types. In our case, we cannot do much downstream analysis.
Describe the solution you'd like
I see two solutions here:
Use some other multi-omic data from a different resource, build an MAE object (or find existing data already in MAE format), and show how we can perform downstream analysis on MOFA factors;
Add an MAE object directly to mia (and use that for MOFA and downstream analysis), which might be more complicated but should at the same time make future work easier.
These two solutions can be implemented simultaneously, and I do not have a preference for either as long as the data provides us with meaningful and interpretable results.
Additional context
The MultiAssayExperiment (MAE) package has a built-in example multi-assay experiment, miniACC;
If we use some other data, it will break the flow of the analysis, which at the moment uses CCA to uncover some interesting relationships and then MOFA to confirm and expand on those findings;
The dataset should, of course, be related to the microbiome, although most of the available multi-omic datasets come from cancer research (e.g., RNA-seq, methylation, mutations, etc.)
We can certainly add another MAE demo data set to mia, for instance. It should be about microbiome research (which is indeed so far less covered in terms of multiomics methodology than cancer studies).
Or we can use an existing data set. Possible sources:
curatedMetagenomicData; it provides a list of TreeSEs for cases where multiomics is available, but MAE support is still under consideration, so our own code would have to convert the experiment list into an MAE
EBI MGnify API through the MGnifyR package; I think this already provides outputs in MAE format. MGnify is a central data resource for European microbiome research and open data sharing, so I think it would be quite a good source if a suitable data set can be identified.
I expect that more informative factors can be identified from data sets with larger sample sizes.
artur-sannikov changed the title from "Use a different MAE data for more meaninful MOFA results" to "Use a different MAE data for more meaningful MOFA results" on Aug 4, 2023