SV calling with Sniffles

Luis Paulin edited this page Jul 21, 2023 · 2 revisions

Sniffles2 is a fast structural variant (SV) caller for long-read sequencing. It accurately detects SVs at the germline, somatic, and population level for PacBio and Oxford Nanopore read data.

Germline SV calling

  • To call germline SVs with Sniffles, only an indexed BAM file is needed. To output deletion (DEL SV) sequences, the reference genome (.fasta) must be specified, e.g. --reference reference.fasta.
  • Sniffles2 is fully parallelized and uses 4 threads by default. This can be changed with the --threads option, e.g. --threads 8. Memory requirements increase with the number of threads used.
  • To also write the binary SNF file, the --snf option is required.
  • To output read names in SNF and VCF files, the --output-rnames option is required.

Example

sniffles --input sample.bam --vcf sample_germline_sv.vcf.gz --reference reference.fasta --snf sample.snf
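
The options listed above can be combined in a single call. The following is a minimal sketch, not an official recipe: the file names and the thread count of 8 are assumptions, and only options mentioned on this page are used. The command is printed first and run only if sniffles and the BAM file are actually present.

```shell
# Hypothetical inputs; adjust names and thread count to your data and machine.
bam="sample.bam"
prefix="${bam%.bam}"
threads=8

# Assemble the call in the positional parameters so it can be printed, then run.
set -- sniffles \
    --input "$bam" \
    --vcf "${prefix}_germline_sv.vcf.gz" \
    --reference reference.fasta \
    --snf "${prefix}.snf" \
    --threads "$threads" \
    --output-rnames

# Show the exact command line before starting a long run.
echo "$*"

# Run only when sniffles is installed and the BAM is present.
if command -v sniffles >/dev/null 2>&1 && [ -f "$bam" ]; then
    "$@"
else
    echo "not run: sniffles or $bam not found" >&2
fi
```

Deriving the output names from the BAM name keeps the VCF and SNF files of one sample together, which helps later when the SNF files are collected for population calling.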

Population SV calling

  • To call population SVs with Sniffles, first call SVs for each sample individually, adding the --snf option so that the binary SNF file is also written.
  • As before, to output deletion (DEL SV) sequences, the reference genome (.fasta) must be specified, e.g. --reference reference.fasta.
  • Next, pass all the SNF files to Sniffles to perform joint SV calling, which produces a fully genotyped population VCF.

Example

# Calling
sniffles --input sample1.bam --vcf sample1.vcf.gz --reference reference.fasta --snf sample1.snf
sniffles --input sample2.bam --vcf sample2.vcf.gz --reference reference.fasta --snf sample2.snf
sniffles --input sample3.bam --vcf sample3.vcf.gz --reference reference.fasta --snf sample3.snf

# Population merge / joint calling
sniffles --input sample1.snf sample2.snf sample3.snf --vcf population.vcf.gz
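
With more than a few samples, the two steps can be scripted as a loop. This is a sketch under stated assumptions: the sample names are hypothetical, each sample is assumed to have an indexed <name>.bam in the working directory, and the run helper and DRY_RUN variable are illustrative additions that are not part of Sniffles. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
set -e  # stop at the first failing command

# Hypothetical sample names and reference; adjust to your data.
samples="sample1 sample2 sample3"
reference="reference.fasta"

# Illustrative helper (not part of Sniffles): print a command,
# and execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: per-sample calling, writing one SNF file per sample.
# The SNF paths are collected in the positional parameters.
set --
for s in $samples; do
    run sniffles --input "$s.bam" --vcf "$s.vcf.gz" --reference "$reference" --snf "$s.snf"
    set -- "$@" "$s.snf"
done

# Step 2: joint calling across all collected SNF files.
run sniffles --input "$@" --vcf population.vcf.gz
```

Because of set -e, a failed per-sample call stops the script before the joint calling step, so the population VCF is never built from an incomplete set of SNF files.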

Mosaic/low-frequency SV calling

  • To call mosaic SVs with Sniffles, use the --mosaic flag.
  • Mosaic calling reports SVs with variant allele frequencies (VAF) between 5% and 20%.
  • As before, to output deletion (DEL SV) sequences, the reference genome (.fasta) must be specified, e.g. --reference reference.fasta.
  • The binary SNF file can also be written in mosaic mode by adding the --snf option.

Example

# Mosaic SV calling
sniffles --input sample.bam --vcf sample_mosaic_sv.vcf.gz --reference reference.fasta --mosaic
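
To make the 5% to 20% VAF window concrete, the sketch below works through the arithmetic for a single site. It assumes VAF is approximated as supporting reads divided by total coverage and treats both bounds as inclusive; how Sniffles computes and thresholds VAF internally may differ. The function name vaf_in_mosaic_range is hypothetical.

```shell
# vaf_in_mosaic_range SUPPORT COVERAGE
# Prints the approximate VAF (support / coverage) and whether it lies
# in the 5-20% mosaic window. Inclusive bounds are an assumption.
vaf_in_mosaic_range() {
    awk -v s="$1" -v c="$2" 'BEGIN {
        if (c <= 0) { print "NA no"; exit }
        vaf = s / c
        printf "%.3f %s\n", vaf, ((vaf >= 0.05 && vaf <= 0.20) ? "yes" : "no")
    }'
}

vaf_in_mosaic_range 3 40    # 3 of 40 reads support the SV  -> 0.075 yes
vaf_in_mosaic_range 20 40   # 20 of 40 reads support the SV -> 0.500 no
```

The second case, with half of the reads supporting the SV, looks like a heterozygous germline variant and falls outside the mosaic window, so it would be expected from germline calling rather than from --mosaic.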