Sniffles2 output #537

Open
annecmg opened this issue Jan 13, 2025 · 3 comments
Comments

@annecmg

annecmg commented Jan 13, 2025

Hello, I am new to structural variant calling and I am looking for a bit more information on how to interpret the Sniffles2 output. I used multi-sample calling on haploid genomes, and now I am unsure how to tell in which samples an SV was found, e.g.,

Fol007_1 7177 Sniffles2.INS.0M0 N TGGTTCATAGACTTTTTGTTTGTTTGTTTGTTTGTTTGTTTGTATATTTTTCGTCTGGTGCTGTAACGAAGCCAGATCTCTACTGGAGCAAAAAAAGGACGTGCTGAGCTAATAGGTACAGGGAAACCCCCGCTTCCTTCACCATCCAGCATACCAATGGCACAAAGTGCTGTATTTCTCAAGCCCGTCACCCGTCAGCGGGTTCAGGGGCAATCTATTGTCGAACTTTTTCCACCGACTGCAGAATTCTCTCGCGCATCACTACATCTGCCGACGCAGTCCACAATCTCCAATGGCAAGTTGTAGCTCGTAGTCGACAATGTCGCGGCGGCAGGGAAAAACCTCCGGGAAGAATCAGAGGCCGACGGTGCGGCTTGCCTTCAAGGAGAAAAAGCCCATCGCAATCACGGCGTCGTTGATCAGGGGATGTCGCCACCGATGCCATGGGCATCGTCCTCCGATATCTAGGCGTTCTGGTTCTGAGGTGCCGCCATTAGATCCGACAAGGTGTTGTCGAAGGCAGACTCTCCCAGAATTCCTTCTCTGAAAACTGCTGTTGGGGAACCGTGTATCCGCAGTTTAATGTGTCGCGTGACTGAAAAATCCACTCCTAACACGGCACTAACAGTGCGTGCCAAGAAAAGAATTTTTTACGGGGAGGAGTTG 53 PASS PRECISE;SVTYPE=INS;SVLEN=670;END=7177;SUPPORT=3;COVERAGE=14,14,14,15,15;STRAND=+-;AC=2;STDEV_LEN=11.719;STDEV_POS=8.083;SUPP_VEC=001000000001100000000000000000000000000 GT:GQ:DR:DV:ID ./.:0:3:0:NULL ./.:0:1:0:NULL 0/0:45:20:1:Sniffles2.INS.0S0 0/0:0:18:0:NULL 0/0:0:10:0:NULL 0/0:0:10:0:NULL 0/0:0:8:0:NULL 0/0:0:5:0:NULL 0/0:0:11:0:NULL 0/0:0:10:0:NULL ./.:0:2:0:NULL 1/1:6:1:6:Sniffles2.INS.0S0 0/0:29:14:1:Sniffles2.INS.0S0 ./.:0:2:0:NULL ./.:0:4:0:NULL ./.:0:3:0:NULL 0/0:0:11:0:NULL ./.:0:0:0:NULL ./.:0:0:0:NULL 0/0:0:24:0:NULL 0/0:0:19:0:NULL 0/0:0:22:0:NULL 0/0:0:18:0:NULL 0/0:0:14:0:NULL ./.:0:2:0:NULL ./.:0:4:0:NULL 0/0:0:5:0:NULL 0/0:0:10:0:NULL ./.:0:0:0:NULL ./.:0:1:0:NULL 0/0:0:22:0:NULL 0/0:0:22:0:NULL ./.:0:4:0:NULL 0/0:0:23:0:NULL 0/0:0:5:0:NULL 0/0:0:6:0:NULL ./.:0:0:0:NULL ./.:0:0:0:NULL ./.:0:0:0:NULL

According to SUPP_VEC this SV is found in the 3rd sample, but that sample is genotyped as 0/0. Although there is only 1 variant-supporting read (DV=1, against DR=20 reference reads), it corresponds to an insertion in the original call (0/0:45:20:1:Sniffles2.INS.0S0). Should this be interpreted as the SV being present, or should only the samples genotyped as 0/1 or 1/1 be included?
Thanks a lot, Anne

@fritzsedlazeck
Owner

Dear Anne,
I know this is confusing. We report 0/0, but there might still be evidence at the read level. In your case, I am assuming you sequenced a population of individuals? If so, the read ratio should correlate with heterogeneity in the population, although it could of course also reflect artifacts. So 0/0 doesn't always mean totally absent. SUPP_VEC shows a 1 for a sample if a single read can be found and assigned to the SV.

Hope that helps a bit?
See you soon :)
Fritz
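
As an illustration of the reply above, here is a minimal Python sketch that decodes SUPP_VEC into sample positions and computes a variant read fraction, DV / (DR + DV), for the flagged samples of the record in the question. The helper names (`supported_samples`, `parse_sample`, `variant_read_fraction`) are hypothetical and not part of Sniffles2. The sketch assumes that the order of SUPP_VEC flags matches the order of the sample columns, and that DR and DV count reference and variant reads as in the `GT:GQ:DR:DV:ID` FORMAT string.

```python
def supported_samples(supp_vec):
    """Return 1-based sample positions whose SUPP_VEC flag is '1'."""
    return [i for i, flag in enumerate(supp_vec, start=1) if flag == "1"]


def parse_sample(format_keys, sample_field):
    """Split one per-sample column into a dict keyed by the FORMAT tags."""
    return dict(zip(format_keys.split(":"), sample_field.split(":")))


def variant_read_fraction(sample):
    """DV / (DR + DV), or None when the sample has no reads at the site."""
    dr, dv = int(sample["DR"]), int(sample["DV"])
    total = dr + dv
    return dv / total if total else None


# Values copied from the record in the question.
supp_vec = "001000000001100000000000000000000000000"
fmt = "GT:GQ:DR:DV:ID"
# Only the flagged sample columns are repeated here (assumed positions 3, 12, 13).
flagged = {
    3: "0/0:45:20:1:Sniffles2.INS.0S0",
    12: "1/1:6:1:6:Sniffles2.INS.0S0",
    13: "0/0:29:14:1:Sniffles2.INS.0S0",
}

for pos in supported_samples(supp_vec):
    sample = parse_sample(fmt, flagged[pos])
    print(pos, sample["GT"], sample["DR"], sample["DV"], variant_read_fraction(sample))
```

For a pooled population sample, the fraction may be more informative than the genotype alone, with the caveat from the reply that a single read can also be an artifact.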

@annecmg
Author

annecmg commented Jan 13, 2025

Thanks for the quick reply! As I understand it, one read would already count as evidence, especially when the SV was also found in other samples? The reads do indeed come from a population!

@fritzsedlazeck
Owner

Correct. The merge step rescues other samples that might have too little evidence on their own, so if there is a passing call in one sample, the others might get rescued.
Cheers
Fritz
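
One possible way to act on this when post-processing the merged VCF is to separate samples that were genotyped with the alternate allele from samples that only carry read-level evidence. The Python sketch below is an assumption about how one might do that, not Sniffles2 behaviour: `classify_sample` and its four labels are hypothetical, and treating any `DV > 0` as evidence is an illustrative choice that should be tuned to the data.

```python
def classify_sample(sample_field, fmt="GT:GQ:DR:DV:ID"):
    """Assign a hypothetical evidence label to one per-sample VCF column."""
    fields = dict(zip(fmt.split(":"), sample_field.split(":")))
    alleles = fields["GT"].replace("|", "/").split("/")
    dv = int(fields["DV"])  # number of variant-supporting reads
    if "1" in alleles:
        return "genotyped_alt"        # 0/1 or 1/1
    if dv > 0:
        return "read_evidence_only"   # e.g. 0/0 with DV >= 1
    if "." in alleles:
        return "no_call"              # ./. and no variant reads
    return "reference"                # 0/0 and no variant reads


# Sample columns copied from the record in the question.
examples = [
    "0/0:45:20:1:Sniffles2.INS.0S0",
    "1/1:6:1:6:Sniffles2.INS.0S0",
    "0/0:0:18:0:NULL",
    "./.:0:3:0:NULL",
]
for column in examples:
    print(column, "->", classify_sample(column))
```

Whether samples labelled `read_evidence_only` should count as carrying the SV depends on the study; in a pooled population they may reflect a low-frequency allele, or an artifact.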
