Raremetal filter out all the variants #31
Comments
Is it possible that you are mixing b37 and b38 datasets when meta-analyzing? Or mixing chromosome names (chr1 vs 1) between datasets? |
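A chromosome-naming mismatch of the kind asked about here can be checked quickly before running the meta-analysis. The sketch below is only an illustration: it assumes variant IDs of the form `CHR:POS:REF:ALT` (as in `chr1:173903903:G:T`, quoted later in this thread), and the helper names `normalize_variant_id` and `overlap_report` are hypothetical, not part of raremetal. It does not parse raremetal files, and it cannot detect a b37/b38 mix directly; a build mismatch would only show up as an unexpectedly small overlap.

```python
def normalize_variant_id(variant_id: str) -> str:
    """Drop a leading 'chr' so 'chr1:100:A:G' and '1:100:A:G' compare equal."""
    chrom, rest = variant_id.split(":", 1)
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return f"{chrom}:{rest}"


def overlap_report(score_ids, group_ids):
    """Count shared variants before and after normalizing chromosome names.

    A raw overlap of 0 with a non-zero normalized overlap points to a
    'chr1' vs '1' naming mismatch between the two lists.
    """
    raw = set(score_ids) & set(group_ids)
    normalized = {normalize_variant_id(v) for v in score_ids} & {
        normalize_variant_id(v) for v in group_ids
    }
    return {"raw_overlap": len(raw), "normalized_overlap": len(normalized)}


if __name__ == "__main__":
    # Hypothetical inputs: IDs read from a score file and from a group file.
    print(overlap_report(["chr1:173903903:G:T"], ["1:173903903:G:T"]))
```

If both counts are zero, the naming convention is not the cause and the positions themselves differ between the two files.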
Hi Jonathon, Thanks for your quick reply. All my data are in b38 and use the chr* format for chromosomes. If this were the problem, wouldn't it also show up in the single-variant association? Thanks Luca |
So to clarify, when you say that "single variant association" works and gene-level does not, do you mean that My earlier suspicion was that one of the studies in What are the contents of |
Yes, that score file has the path to the files
similarly cov_file
The variants for /test_dir/score_file overlap with the one in group file. for instance, the variant in
the variants in the group file for that gene are:
|
I see. The names of those score and cov files suggest that you have run Does this make sense? Or am I misunderstanding what's going on? |
Thanks for pointing that out! I changed the workflow. Now, raremetalworker performs one association with all my variants of interest, producing only one score file and one cov file. However, raremetal still has the same problem. Any further suggestions? |
I'm not sure that Raremetal will run for a single study, i.e., you would need more than one score file and more than one cov file. I suspect that Raremetal is not the tool you should be using. Raremetal is meant for meta-analysis of multiple studies that either have study-specific covariates and/or consent constraints that prevent the studies from being analyzed together. I'm not sure exactly what you are trying to achieve (polygenic analysis, gene-wise analysis accounting for hidden relatedness, etc.), but there are likely other tools better suited for your needs. |
I'd like to perform a burden test: collapse the effect sizes of all variants within the same gene to get a 'cumulative' effect size. Isn't that possible with raremetal? The problem is that it is filtering out all the variants. What are the reasons for raremetal to filter out variants? |
Raremetal will do burden tests if you have multiple datasets (sets of individuals), but it sounds like you only have one. You should look into EPACTS (https://genome.sph.umich.edu/wiki/EPACTS#Gene-wise_or_group-wise_tests) if your sample size is modest or if your sample size is large but you don't want to account for hidden relatedness. Otherwise, SAIGE-GENE (https://github.com/weizhouUMICH/SAIGE) will do burden tests for sample sizes in the hundreds of thousands and account for hidden relatedness. |
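For readers unfamiliar with how a burden test is built from per-variant summary statistics, here is a minimal sketch of the textbook score-based form: with score statistics U, their covariance matrix V, and weights w, the statistic is (wᵀU)² / (wᵀVw), compared against a chi-square distribution with 1 degree of freedom. This is an illustration under stated assumptions, not raremetal's implementation; the function name `burden_test` is hypothetical, and the scaling conventions used in raremetal's score and cov files may differ.

```python
import math


def burden_test(scores, cov, weights=None):
    """Textbook score-based burden test for one gene.

    scores  : per-variant score statistics U_j
    cov     : covariance matrix of the scores (list of lists)
    weights : per-variant weights; defaults to 1 for every variant
    Returns (statistic, p_value) for a 1-df chi-square test.
    """
    m = len(scores)
    if weights is None:
        weights = [1.0] * m
    # Collapsed score: weighted sum of the per-variant scores.
    u = sum(w * s for w, s in zip(weights, scores))
    # Variance of the collapsed score: w' V w.
    v = sum(weights[i] * cov[i][j] * weights[j] for i in range(m) for j in range(m))
    if v <= 0:
        raise ValueError("collapsed variance must be positive")
    stat = u * u / v
    # Upper tail of a chi-square with 1 df: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    stat, p = burden_test([2.0, 1.0], [[4.0, 1.0], [1.0, 2.0]])
    print(stat, p)
```

Only the scores, their covariance, and the weights enter the statistic, so a variant dropped at the filtering stage contributes nothing to the gene-level result.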
I'll look into these other software packages, thank you. Do you know why the variants are filtered out, then? |
What happens if you run without |
Dear Ryan, Thanks for your suggestion. without
The log file is similar, all the variants are filtered out and it ends with:
|
For the variants that are filtered out when running the burden test (such as chr1:173903903:G:T), do they have a non-missing score stat and p-value in the single variant file produced by raremetalworker? Also, as Jonathon mentions, there are other options for single study rare variant tests, such as SAIGE-GENE, GMMAT, and GENESIS. I think SAIGE-GENE might be the only one that supports very unbalanced case/control ratios (?) |
Hi Ryan, Yes, the majority of the variants get a score and p-value in raremetalworker, but they are then filtered out anyway by raremetal. For instance: output raremetalworker score
Raremetal log file
|
Huh, I'm at a loss there. Is your data confidential? Could you maybe send me the data for one of those genes? That way we can run it in the debugger and see what's going on. |
Hi Ryan, It looks like removing I have another question, possibly easier to solve now that we know where the problem is: When I use
however, the 6th column of the ped file is:
I can use calculateOR.pl, but then would these values be used in the burden test? Can I swap the columns to let raremetal use the OR rather than effect sizes? Best, Luca |
Does I would be cautious using raremetal/raremetalworker/src/FastFit.h Lines 71 to 73 in 2c82cfc
So my guess is the original authors didn't intend for it to be used. My understanding of But unfortunately there is no written explanation that I can find on the wiki or in the script. I believe the burden test only looks at the score statistic, covariance, and possibly allele frequencies (weights). So it will likely not take into account your updated effect size estimates. Actually the effects are just calculated from the score stat and variance, and written out to the file. At the end of the day, for a binary trait, probably best to use software designed (and tested) to correctly handle them. |
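The remark that effects are derived from the score statistic and its variance can be made concrete with the standard one-step approximation: beta ≈ U / V and SE ≈ 1 / sqrt(V), which gives a z-score of U / sqrt(V). The sketch below assumes that approximation; the function name `effect_from_score` is hypothetical, and whether raremetal applies additional scaling (for example by the residual variance) is not confirmed here.

```python
import math


def effect_from_score(u, v):
    """One-step effect estimate from a score statistic u and its variance v.

    Returns (beta, se, z) with beta = u / v, se = 1 / sqrt(v), z = u / sqrt(v).
    """
    if v <= 0:
        raise ValueError("variance must be positive")
    beta = u / v
    se = 1.0 / math.sqrt(v)
    z = u / math.sqrt(v)
    return beta, se, z


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(effect_from_score(6.0, 4.0))
```

Because these estimates are a by-product of the score statistic, replacing them afterwards (for example with odds ratios computed separately) would not change a burden test that works from the scores and covariances.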
Dear raremetal developers,
I really appreciate the work you are doing to develop and maintain raremetal. I am reaching out because I need some help to understand raremetal behaviour.
Recently, I have been trying to run raremetal on a set of variants (~50) in 3 genes. The cohort has about 130,000 individuals. I am using the commands pasted below. The single-variant association runs OK, and I get the score and cov files. When I run the raremetal burden and SKAT tests (I also tried the other weighted approaches), all the variants are filtered out; see the log file below.
Do you have any suggestions on how to fix this behaviour?
Any suggestion is much appreciated,
Best,
Luca
PED
DAT
VCF
Commands
and
Raremetalworker output score
Raremetalworker output cov
Group file
log output raremetal