multiple cohorts from the same tissue #102

Open
arkyl opened this issue Oct 19, 2021 · 5 comments

Comments

@arkyl

arkyl commented Oct 19, 2021

Hi,
Thanks much for the software.
I am wondering what the best practice is when there are multiple cohorts from the same tissue. For example, with 3 cohorts from tissue A, 2 cohorts from tissue B, and 1 cohort from tissue C: should I input the results from all 6 cohorts to mash, or should I first combine them (e.g. by meta-analysis, to get a single result per tissue) and input 3 tissue-level results to mash?

My other question is that the sample size may vary a lot, from tissue A (e.g. ~1000) to tissue C (e.g. ~50). Would that be a problem for mash?

Thanks a lot for your advice!

Yue

@gaow
Member

gaow commented Oct 19, 2021

@arkyl Good question. Are the cohorts from the same population? Ideally you could check this by running their genotypes through e.g. PCA and comparing the PCs.

@arkyl
Author

arkyl commented Oct 19, 2021

Thanks for the quick reply. The cohorts are all of European descent, so I guess they can be regarded as the same population. The initial results from the different cohorts within the same tissue are indeed very similar, as expected.

@gaow
Member

gaow commented Oct 19, 2021

Assuming there are no overlapping samples in these cohorts, performing a fixed-effect meta-analysis to merge the cohorts for each tissue would be the same as forcing the correlations between those cohorts to be 1 in a mash model. Not sure what others think of this (comments welcome!), but I would probably perform the meta-analysis first for each tissue, to force mash into using a reasonable model. The interpretation down the road might also be simpler, e.g. you can make statements about sharing across tissues rather than across cohort+tissue combinations.
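
For readers who want to see what the per-tissue merge looks like, here is a minimal sketch of a standard inverse-variance-weighted fixed-effect meta-analysis for a single variant in a single tissue. It is plain Python, not part of mash; the function name `fixed_effect_meta` and the effect sizes and standard errors are hypothetical, and it assumes the cohorts share no samples.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-cohort effect estimates with inverse-variance weights.

    betas: per-cohort effect size estimates for one variant
    ses:   matching per-cohort standard errors
    Returns (combined_beta, combined_se).
    Assumes independent cohorts (no overlapping samples).
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and equally long")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    combined_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    combined_se = math.sqrt(1.0 / total_weight)
    return combined_beta, combined_se


# Hypothetical summary statistics for 3 cohorts from tissue A.
tissue_a_betas = [0.21, 0.18, 0.25]
tissue_a_ses = [0.05, 0.06, 0.08]

beta_a, se_a = fixed_effect_meta(tissue_a_betas, tissue_a_ses)
print(f"tissue A: beta = {beta_a:.4f}, se = {se_a:.4f}")
```

Repeating this for every variant in each tissue would give one effect estimate and one standard error per tissue, which is the 3-column input described in the question.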

@gaow
Member

gaow commented Oct 19, 2021

> My other question is that the sample size may vary a lot, from tissue A (e.g. ~1000) to tissue C (e.g. ~50). Would that be a problem for mash?

So your z-scores in tissue C are expected to be smaller than those in tissue A, even though the effect size estimates may be on a similar scale: standard errors are larger in smaller samples, which gives smaller z-scores. This may be relevant to choosing between the EE and EZ models in mash (see the alpha parameter in the documentation for details). We generally suggest trying both and using the model that results in the larger likelihood.
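
The sample-size argument can be illustrated with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the standard error shrinks as `residual_sd / sqrt(n)` and that the true effect is the same in both tissues; the helper `approx_z` and all numbers are hypothetical and ignore genotype variance, covariates, and anything mash itself does.

```python
import math


def approx_z(beta, n, residual_sd=1.0):
    """Return (z, se) under the simplifying assumption se = residual_sd / sqrt(n)."""
    se = residual_sd / math.sqrt(n)
    return beta / se, se


beta = 0.1  # hypothetical effect size, identical in both tissues

z_a, se_a = approx_z(beta, n=1000)  # tissue A, ~1000 samples
z_c, se_c = approx_z(beta, n=50)    # tissue C, ~50 samples

print(f"tissue A: se = {se_a:.4f}, z = {z_a:.2f}")
print(f"tissue C: se = {se_c:.4f}, z = {z_c:.2f}")
print(f"z ratio A/C = {z_a / z_c:.2f}")
```

Under this assumption the effect estimates are on the same scale in both tissues, while the z-scores differ by a factor of sqrt(1000 / 50), which is why the choice between modeling effects and modeling z-scores can matter when sample sizes differ this much.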

@arkyl
Author

arkyl commented Oct 19, 2021

Thanks a lot for the detailed explanation and suggestions!
