
Implementation for a large cohort #23

Open
lee039 opened this issue Feb 3, 2021 · 2 comments

@lee039

lee039 commented Feb 3, 2021

Dear authors/developers,

Previously I wrote a short script that makes violin plots to visualize the average depth of all Ref/Ref, Ref/Alt, and Alt/Alt samples for each SV site (aggregating duphold-annotated values). I did not count the exact numbers, but I often encountered ambiguous violin plots (e.g. a hom-del site with depths ranging from 0.1 to 0.9).
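For context, the kind of grouping my script does can be sketched as below. This is a minimal illustration, not my actual script: it assumes the duphold `DHFFC` FORMAT field (depth fold-change versus flanking regions) has already been extracted per sample, and the helper names and 0.25 cutoff are my own choices for illustration.

```python
# Minimal sketch: group duphold DHFFC values (depth fold-change vs.
# flanking regions) by genotype class, so each class can be rendered
# as one violin. Helper names and the threshold are illustrative.

def group_dhffc_by_genotype(calls):
    """calls: iterable of (genotype, dhffc) pairs, genotype like '0/0'."""
    classes = {"0/0": "ref_ref", "0/1": "ref_alt", "1/1": "alt_alt"}
    groups = {"ref_ref": [], "ref_alt": [], "alt_alt": []}
    for gt, dhffc in calls:
        gt = gt.replace("|", "/")  # treat phased calls like unphased ones
        label = classes.get(gt)
        if label is not None and dhffc is not None:
            groups[label].append(dhffc)
    return groups

def is_ambiguous_hom_del(groups, hi=0.25):
    """Flag a deletion site whose hom-alt depth is not clearly reduced."""
    return any(d > hi for d in groups["alt_alt"])
```

The three lists in `groups` can then be passed to e.g. `matplotlib.pyplot.violinplot`; an ambiguous site is one where `is_ambiguous_hom_del` returns True.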

It occurs to me that SV-plaudit is much smarter in that it shows the raw data directly.
Thanks a lot for sharing this amazing tool. I hope SV-plaudit can save much of the time I currently spend on manual IGV inspection.

So, I am trying to figure out whether SV-plaudit is feasible for my analyses.

  • I have a cohort of ~300 samples at ~30X coverage.
  • I called SVs using Smoove and discovered 44K sites (including 20K BND sites).

Reading your paper and some issues, I see the following restrictions:

  • Very long events will not work (one user attempted a 227 Mb duplication, which failed). What would be the maximum length?
  • BND (breakend) visualization is not supported; this leaves me 24K sites to evaluate.
  • Does SV-plaudit support cohort-level visualization (i.e. of all ~300 samples)? I don't think it would be fun to evaluate images showing 300 samples, though... In the Atlantic salmon SV paper (Bertolotti et al.), 3 samples (one each of Ref/Ref, Ref/Alt, and Alt/Alt) were selected for SV-plaudit inspection. Would 3 samples be enough, or would you make SV-plaudit plots for, say, two samples per genotype class (maybe even better)?
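To make the last point concrete, the per-genotype sample selection I have in mind could look like the sketch below. This is only my illustration of the Bertolotti et al. approach, not part of SV-plaudit; the function name and the `k` parameter (samples per genotype class) are hypothetical.

```python
# Hypothetical sketch: for one SV site, pick up to k samples from each
# genotype class (Ref/Ref, Ref/Alt, Alt/Alt) to send to SV-plaudit for
# image generation, instead of plotting all ~300 cohort samples.

def pick_samples(genotypes, k=1):
    """genotypes: dict sample_name -> GT string; returns chosen sample names."""
    wanted = {"0/0": [], "0/1": [], "1/1": []}
    for sample, gt in sorted(genotypes.items()):  # sort for determinism
        gt = gt.replace("|", "/")                 # phased == unphased here
        if gt in wanted and len(wanted[gt]) < k:
            wanted[gt].append(sample)
    # Return Ref/Ref carriers first, then hets, then hom-alts.
    return [s for cls in ("0/0", "0/1", "1/1") for s in wanted[cls]]
```

With `k=1` this reproduces the 3-sample scheme from the salmon paper; `k=2` would give the "two per genotype class" alternative I am asking about.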

In the GigaScience paper (Belyeu et al. 2018), SV-plaudit's performance was compared to SVtyper and CNVnator.
The results section explains: "SVTYPER called 30.7% of the deletions that were unanimously GOOD as homozygous reference".
What I am curious about: should SV-plaudit be regarded as a quick-scanning tool to confirm that predicted SV sites are real SVs? It would not necessarily lead to accurate genotyping, which is what SVtyper does. Am I right?

One last question: could you please explain what was meant by this sentence in the discussion of the GigaScience paper? "While this second level of analysis is crucial, it is beyond the scope of this paper, and we argue this analysis be performed only for those SVs that are fully supported by the alignment data."
I am unsure what "SVs that are fully supported by the alignment data" means. SVs are called from alignment evidence in the first place, so aren't all predicted SVs supported by the alignments? Maybe I have gotten something severely wrong?

Thank you very much in advance!

Lim

@ryanlayer
Collaborator

ryanlayer commented Feb 3, 2021 via email

@lee039
Author

lee039 commented Feb 8, 2021

Hi,

I replied to your e-mail address "'[email protected]'".
Could you please check your mailbox?

Lim
