vcf report export is short of records compared to tsv and excel #296

Open
clinicalngs opened this issue Sep 30, 2024 · 1 comment

@clinicalngs

I am trying to export a VCF from an annotated single-sample file. The VCF contains fewer entries than the GUI shows (the Excel and TSV exports are correct).
This happens whether the reports are generated from the GUI or from the command line.
No filtering was applied at this point.

The objective is to extract certain variants from priority genes and pass the VCF to a different annotator, which imposes data size limits.
The procedure was:

  1. Annotate a WGS or WES sample with many annotators, including clinvar & gnomad3, on Windows (oc v. 2.8).
  2. Shrink the sqlite with a filter (clinvar: no benign; gnomad <0.01; set of priority genes) on the command line using oc util.
  3. Reimport the smaller sqlite into the oc GUI and verify the number of variants. Also open the smaller sqlite in DB Browser (a viewer for sqlite) and verify the selected variants. All OK so far.
  4. Export VCF, TSV, and Excel from the GUI and from the command line. Both procedures give the same result: the VCF is missing some variants (~20%) compared to the smaller sqlite and to the TSV and Excel exports. No pattern of exclusion can be observed: low and high quality, SNPs and InDels, all chromosomes, same genes. All appear to be included or excluded equally or at random.
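
The comparison in step 4 can be sketched as a set difference over (chrom, pos, ref, alt) keys, which lists exactly which variants are present in the TSV but absent from the VCF. This is only a minimal sketch under stated assumptions: the TSV column names (`chrom`, `pos`, `ref_base`, `alt_base`) are hypothetical and must be adjusted to the real export header, and comment lines are assumed to start with `#`. It also expands comma-separated ALT alleles, because a multi-allelic VCF line holds several variants on one line, so counting lines can undercount. That is something worth ruling out, not a confirmed cause of the discrepancy.

```python
import csv


def vcf_keys(lines):
    """Return the set of (chrom, pos, ref, alt) keys, one per ALT allele."""
    keys = set()
    for line in lines:
        # Skip blank lines, ## meta lines, and the #CHROM header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # A multi-allelic record lists several ALT alleles separated by commas.
        for alt in alts.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def tsv_keys(lines, chrom_col="chrom", pos_col="pos",
             ref_col="ref_base", alt_col="alt_base"):
    """Return the set of keys from a TSV export.

    The column names are hypothetical placeholders; lines starting
    with '#' are assumed to be comments and are skipped.
    """
    data = (line for line in lines if not line.startswith("#"))
    rows = csv.DictReader(data, delimiter="\t")
    return {(r[chrom_col], int(r[pos_col]), r[ref_col], r[alt_col]) for r in rows}


if __name__ == "__main__":
    # Tiny made-up inputs, only to show the mechanics.
    tsv_text = (
        "chrom\tpos\tref_base\talt_base\n"
        "chr1\t100\tA\tG\n"
        "chr1\t100\tA\tT\n"
        "chr2\t200\tC\tT\n"
    )
    vcf_text = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\n"
        "chr1\t100\t.\tA\tG,T\n"
    )
    missing = tsv_keys(tsv_text.splitlines()) - vcf_keys(vcf_text.splitlines())
    print(sorted(missing))  # [('chr2', 200, 'C', 'T')]
```

If the two exports represent InDels differently (for example trimmed versus anchored alleles), the keys will not match directly and would need to be normalized before comparing.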

Is there any reason why exporting in VCF format would filter out or miss certain variants? Perhaps I did something wrong.
I would like to obtain the same number of variants as in the smaller filtered sqlite file. Ideally I would also like to remove all previous annotators to further shrink the data file.
Thanks

@jasminebro
Collaborator

Hi @clinicalngs, our IT team could not reproduce your issue. Are you able to share your VCF file with us via email ([email protected]) so we can better troubleshoot the issue?

Thank you for using OpenCRAVAT!
