
help in running BERMUDA #1

Open
nmalwinka opened this issue May 23, 2019 · 5 comments

Comments

@nmalwinka

Hi, would you be able to add an example script showing how to connect the pre-processing in R with the autoencoder in Python, please?

nmalwinka reopened this May 23, 2019
@txWang
Owner

txWang commented May 25, 2019

Hi,

We used two packages in R and saved the results as .csv files in order to run BERMUDA. You can follow the preprocessing steps in BERMUDA/R/pre_processing.R:
First, we used Seurat to find highly variable genes and to cluster the cells of each batch (e.g. BERMUDA/pancreas/muraro_seurat.csv).
Then, we used MetaNeighbor to generate a similarity matrix between the clusters of different batches (e.g. BERMUDA/pancreas/pancreas_metaneighbor.csv).
Once you have the required .csv files, you can run BERMUDA directly (e.g. BERMUDA/main_pancreas.py).
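Before running main_pancreas.py, it can help to sanity-check that the two R outputs parse the way you expect. A minimal sketch using miniature made-up files (the column layouts here are illustrative assumptions, not BERMUDA's exact schema; compare against the real .csv files shipped in the repo):

```python
import csv
import io

# Hypothetical miniature stand-ins for the two files the R step writes out.
seurat_csv = """cell,cluster,gene1,gene2
c1,1,0.5,1.2
c2,2,0.1,0.9
"""
metaneighbor_csv = """,batch1_1,batch1_2
batch2_1,0.92,0.31
batch2_2,0.28,0.88
"""

# Cluster assignment per cell, as produced by the Seurat step.
clusters = {row["cell"]: row["cluster"]
            for row in csv.DictReader(io.StringIO(seurat_csv))}

# Cluster-to-cluster similarity matrix, as produced by the MetaNeighbor step.
sim = {}
reader = csv.reader(io.StringIO(metaneighbor_csv))
cols = next(reader)[1:]
for row in reader:
    for col, val in zip(cols, row[1:]):
        sim[(row[0], col)] = float(val)

print(clusters)                        # {'c1': '1', 'c2': '2'}
print(sim[("batch2_1", "batch1_1")])   # 0.92
```

If both files load and the cluster labels in the similarity matrix match the ones Seurat assigned, the inputs are in shape for the Python side.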
Hope this is helpful.

Best,
Tongxin

@nmalwinka
Author

nmalwinka commented Jun 24, 2019

Hi again, my dataset is quite large and I am running out of memory, getting this error:
Error: cannot allocate vector of size 656.7 Gb
Execution halted
The MetaNeighbor package from Maggie Crow has some updated code that avoids vectorising (see MetaNeighborUSLowMem in https://github.com/gillislab/MetaNeighbor/blob/master/R/MetaNeighborUS.R). Have you tried updating your code so that bigger datasets can run through BERMUDA?
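The low-memory trick referenced above amounts to never materialising the full pairwise matrix at once. This is not MetaNeighbor's or BERMUDA's code, just a minimal numpy sketch of the same idea: computing row-wise Pearson correlations between two batches one chunk of rows at a time, so peak memory is bounded by the chunk size rather than the whole result:

```python
import numpy as np

def chunked_correlation(a, b, chunk=256):
    """Pearson correlation between each row of `a` and each row of `b`,
    computed chunk-by-chunk to bound peak intermediate memory."""
    # z-score each row (population std), so correlation reduces to a dot product / n
    az = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    bz = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    n = a.shape[1]
    out = np.empty((a.shape[0], b.shape[0]))
    for start in range(0, a.shape[0], chunk):
        # only a (chunk x b.rows) block is computed at a time
        out[start:start + chunk] = az[start:start + chunk] @ bz.T / n
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 50))   # batch 1: 1000 cells x 50 genes
y = rng.normal(size=(800, 50))    # batch 2: 800 cells x 50 genes
corr = chunked_correlation(x, y, chunk=128)
print(corr.shape)  # (1000, 800)
```

With tens of thousands of cells, streaming the output chunks to disk instead of keeping `out` in RAM extends the same idea one step further.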

@nmalwinka
Author

I managed to figure it out by myself. I have a problem with the result, though. After loading code_list and producing code, I expected it to be the same array size as data, but it isn't:

>>> code.shape
(51687, 20)
>>> data.shape
(51687, 2583)

The number of cells is the same, but there are only 20 genes(?) instead of the 2583 variable genes.

A further question is how to transform this back into a Seurat object?
Many thanks

@txWang
Owner

txWang commented Jul 12, 2019

Hi,

Thank you for your question. Like many batch-correction methods, BERMUDA removes batch effects by projecting the original data into a low-dimensional space (20 dimensions here). The low-dimensional code does not suffer from batch effects and can be used for further analysis such as visualization.
Currently, we do not support the transformation between our results and Seurat objects.
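To illustrate why `code` has 20 columns: the autoencoder maps each cell from gene space to a 20-dimensional latent space, so the columns are latent dimensions, not genes. A minimal numpy sketch with a linear stand-in for the trained encoder (the cell count is scaled down here, and BERMUDA's real encoder is a neural network, not this random projection):

```python
import numpy as np

rng = np.random.default_rng(0)

n_cells, n_genes, n_latent = 2000, 2583, 20   # 2583 variable genes, 20 latent dims
data = rng.normal(size=(n_cells, n_genes))    # cells x genes, like `data` in the thread

# Stand-in for the trained encoder: any map from gene space to latent space.
W = rng.normal(size=(n_genes, n_latent))
code = data @ W                               # cells x latent dims, like `code`

print(data.shape)  # (2000, 2583)
print(code.shape)  # (2000, 20): one 20-dim embedding per cell, not 20 genes
```

The 20-dim rows of `code` are what you would feed into clustering or a UMAP/t-SNE visualization in place of the original expression matrix.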

Best,
Tongxin

@yzcv

yzcv commented Nov 12, 2019

Hi,

I am very interested in your BERMUDA work, but I have a problem running "pre_processing.R" in the BERMUDA/R folder. I am wondering if you could provide the two datasets, "muraro_human.csv" and "baron_human.csv", which "pre_processing.R" requires. Thank you in advance.
