Single Sample SVs
Sample commands:
run_cnvkit -> cnvkit_main (docker: etal/cnvkit) /usr/bin/python /usr/local/bin/cnvkit.py batch -r <cnvkit_reference_cnn> --method wgs
run_cnvkit -> cns_to_vcf (docker: etal/cnvkit) /usr/bin/python /usr/local/bin/cnvkit.py call <cnvkit_main output cns> -o adjusted.tumor.cns && /usr/bin/python /usr/local/bin/cnvkit.py export vcf adjusted.tumor.cns --cnr <cnvkit_main output cnr> -o cnvkit.vcf
run_manta (docker: mgibio/manta_somatic-cwl) /usr/bin/python /usr/bin/manta/bin/configManta.py --referenceFasta --tumorBam --runDir && /usr/bin/python runWorkflow.py -m local -j 12
run_smoove (docker: brentp/smoove) /usr/local/bin/smoove call --processes 4 -F --genotype --name SV --fasta --exclude <smoove_exclude_regions>
INSERT LINK TO DOCKER IMAGES/REPOS AND CWL
INSERT PROCESS DIAGRAM
Name | Description | Example | Required |
---|---|---|---|
bam | Aligned sequencing results to be analyzed for SVs | | ✓ |
cnvkit_diagram | Create an ideogram of copy ratios on chromosomes as a pdf | false | |
cnvkit_drop_low_coverage | Helps avoid false positive deletions in low quality tumor samples | false | |
cnvkit_male_reference | Use/assume a male reference | false | |
cnvkit_method | Sequencing protocol used | wgs | |
cnvkit_reference_cnn | A copy number reference file against which potential copy number variants will be evaluated | /gscmnt/gc2560/core/cnvkit_pon/v1/reference.cnn | ✓ |
cnvkit_scatter_plot | Create a whole genome copy ratio profile as a pdf scatter plot | false | |
cnvkit_vcf_name | Custom name to use for the cnvkit output vcf | cnvkit_output | |
manta_call_regions | bgzip-compressed, tabix-indexed BED file specifying regions to which variant analysis will be restricted | ||
manta_non_wgs | When true, activates settings appropriate for whole exome sequencing | false | |
manta_output_contigs | if true, outputs assembled contig sequences in final VCF files, in the INFO field CONTIG | true | |
maximum_sv_pop_freq | Population frequency above which variants will be filtered out | ||
merge_estimate_sv_distance | When evaluating variants to be merged, estimate distance based on the size of the sv | true | ✓ |
merge_max_distance | Maximum distance of variants to consider for merging | 1000 | ✓ |
merge_min_sv_size | Minimum size of SVs to merge | 1 | ✓ |
merge_min_svs | Minimum number of sv calls needed to be merged | 1 | ✓ |
merge_same_strand | Require merged SVs to be on the same strand | true | ✓ |
merge_same_type | Require merged SVs to be of the same type | true | ✓ |
merge_sv_pop_freq_db | bed file containing allele frequencies for a population | /gscmnt/gc2560/core/cwl/inputs/hall_lab_B38_SV_public_callset/sv.bedpe.gz | ✓ |
reference | Reference sequence | example_data/exome_workflow/chr17_test.fa | ✓ |
smoove_exclude_regions | Regions to be ignored when calling SVs through smoove (a wrapper for lumpy) | ||
sv_filter_interval_lists | One or more interval lists defining regions to keep in the output vcf, labeled with the source of the intervals | /gscmnt/gc2560/core/model_data/interval-list/db8c25932fd94d2a8a073a2e20449878/a35b64d628b94df194040032d53b5616.interval_list, /gscmnt/gc2560/core/model_data/interval-list/1eea27120d294db49826cef2e79b618c/3a61ffd42f074fe1b8a20742f6dfb32e.interval_list, /gscmnt/gc2560/core/model_data/interval-list/86494a288c3c4d7a89842ed2f1d6e36a/f54639200d364231bd5e1c39266ccfac.interval_list | ✓ |
variants_to_table_fields | one or more of any standard VCF column (CHROM, ID, QUAL) or any binding in the INFO field (e.g., AC=10) to add to the tsv report | ||
variants_to_table_genotype_fields | one or more of any binding in the FORMAT field (e.g., GQ, PL) to add to the tsv report | ||
vep_cache_dir | Location of a local ensembl cache to be used by vep | example_data/exome_workflow/ | ✓ |
vep_ensembl_assembly | Which (species) assembly vep should use | GRCh38 | ✓ |
vep_ensembl_species | Which species vep should use | homo_sapiens | ✓ |
vep_ensembl_version | Which ensembl release vep should use | 95 | ✓ |
vep_to_table_fields | VEP CSQ annotation fields to add to the tsv report | ||
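
The inputs above are normally supplied to a CWL runner as a job file. The following is a minimal sketch, not the workflow's own tooling, that assembles such a job in Python and checks it against the required inputs listed in the table. Several things here are assumptions: `tumor.bam` is a hypothetical path, `bam` is treated as required, the choice of which inputs are CWL `File` or `Directory` objects versus plain strings is a guess, and the labeling structure expected by `sv_filter_interval_lists` is not shown. Check each type against the workflow's CWL definition before use.

```python
import json

# Inputs treated as required, following the table above.
REQUIRED = [
    "bam", "cnvkit_reference_cnn",
    "merge_estimate_sv_distance", "merge_max_distance", "merge_min_sv_size",
    "merge_min_svs", "merge_same_strand", "merge_same_type",
    "merge_sv_pop_freq_db", "reference", "sv_filter_interval_lists",
    "vep_cache_dir", "vep_ensembl_assembly", "vep_ensembl_species",
    "vep_ensembl_version",
]


def cwl_file(path):
    """Return a CWL File object as it appears in a job file."""
    return {"class": "File", "path": path}


def cwl_directory(path):
    """Return a CWL Directory object as it appears in a job file."""
    return {"class": "Directory", "path": path}


def missing_required(job):
    """List required input names that are absent from the job."""
    return [name for name in REQUIRED if name not in job]


job = {
    "bam": cwl_file("tumor.bam"),  # hypothetical path
    "cnvkit_method": "wgs",
    "cnvkit_reference_cnn": cwl_file(
        "/gscmnt/gc2560/core/cnvkit_pon/v1/reference.cnn"),
    "merge_estimate_sv_distance": True,
    "merge_max_distance": 1000,
    "merge_min_sv_size": 1,
    "merge_min_svs": 1,
    "merge_same_strand": True,
    "merge_same_type": True,
    "merge_sv_pop_freq_db": cwl_file(
        "/gscmnt/gc2560/core/cwl/inputs/hall_lab_B38_SV_public_callset/sv.bedpe.gz"),
    # Assumption: passed as a plain string; the CWL may expect a File.
    "reference": "example_data/exome_workflow/chr17_test.fa",
    # Assumption: a bare list of File objects. The table says each list is
    # labeled with its source, so the real input may need a label field too.
    "sv_filter_interval_lists": [
        cwl_file("/gscmnt/gc2560/core/model_data/interval-list/"
                 "db8c25932fd94d2a8a073a2e20449878/"
                 "a35b64d628b94df194040032d53b5616.interval_list"),
    ],
    "vep_cache_dir": cwl_directory("example_data/exome_workflow/"),
    "vep_ensembl_assembly": "GRCh38",
    "vep_ensembl_species": "homo_sapiens",
    "vep_ensembl_version": "95",  # assumption: string rather than integer
}

missing = missing_required(job)
if missing:
    raise SystemExit("missing required inputs: " + ", ".join(missing))

# JSON is valid YAML, so this output can be saved as the job file.
print(json.dumps(job, indent=2))
```

Optional inputs such as `maximum_sv_pop_freq` or `smoove_exclude_regions` can be added to `job` in the same way.
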
Name | Source | Description |
---|---|---|
annotated_tsvs | GATK VariantsToTable | tsv files containing specified SV fields and annotations |
cn_diagram | CNVkit | ideogram of copy ratios on chromosomes |
cn_scatter_plot | CNVkit | whole genome copy ratio profile |
cnvkit_vcf | CNVkit | final cnvkit output, converted to vcf format |
filtered_vcfs | Various filters | SV VCF, filtered by variant population frequency and the interval lists given in sv_filter_interval_lists |
manta_all_candidates | Manta | Unscored SV and indel candidates |
manta_diploid_variants | Manta | SVs and indels scored and genotyped under a diploid model |
manta_small_candidates | Manta | simple insertion and deletion variants less than the minimum scored variant size (50 by default) |
manta_somatic_variants | Manta | SVs and indels scored under a somatic variant model |
manta_tumor_only_variants | Manta | Subset of the candidateSV.vcf.gz file after removing redundant candidates and small indels less than the minimum scored variant size (50 by default) |
merged_annotated_svs | Survivor, VEP | SV calls from Manta, CNVkit, and Smoove (Lumpy), merged by Survivor and annotated by VEP |
smoove_output_variants | Smoove (Lumpy) | SV calls from Smoove, a wrapper for Lumpy |
sv_pop_filtered_vcf | Various filters | SV VCF, filtered by variant population frequency |
tumor_antitarget_coverage | CNVkit | Coverage in the antitarget regions from bam read depths |
tumor_bin_level_ratios | CNVkit | table of copy number ratios |
tumor_segmented_ratios | CNVkit | discrete copy number segments from the above table |
tumor_target_coverage | CNVkit | Coverage in the target regions from bam read depths |
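
As a quick sanity check on any of the VCF outputs above, it can help to tally records by SV type. The sketch below is an illustration only and is not part of the workflow. It assumes each record carries the standard `SVTYPE` key in its INFO column, which may not hold for every file; records without it are counted as `UNKNOWN`. The example records are made up.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def count_sv_types(lines):
    """Count VCF records by the SVTYPE value in the INFO column."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        svtype = "UNKNOWN"
        for entry in fields[7].split(";"):
            if entry.startswith("SVTYPE="):
                svtype = entry[len("SVTYPE="):]
                break
        counts[svtype] += 1
    return counts


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr17\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000\n",
    "chr17\t5000\tsv2\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=9000\n",
    "chr17\t9500\tsv3\tN\t<DEL>\t.\tPASS\tEND=9900;SVTYPE=DEL\n",
]
print(dict(count_sv_types(example)))
```

To run it on a real output, pass an open file handle instead of the list, for example `with open_vcf(path) as fh: count_sv_types(fh)`, where `path` points at one of the VCFs produced by the workflow.
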