add somatic SV calling to somatic workflow #101

Open
wants to merge 4 commits into
base: develop
391 changes: 315 additions & 76 deletions docs/germline-inputs.md

Large diffs are not rendered by default.

10 changes: 9 additions & 1 deletion docs/index.md
@@ -12,7 +12,7 @@ MultiQC), adapter clipping (using cutadapt), mapping (using BWA mem or
bwakit) and variant calling (based on the
[GATK Best Practice](https://software.broadinstitute.org/gatk/best-practices/)
for germline calling, and using a variety of callers for somatic calling).
-Optionally, the somatic workflow can also perform CNV calling.
+Optionally, the somatic workflow can also perform CNV calling and SV calling.

This workflow is part of [BioWDL](https://biowdl.github.io/)
developed by the SASC team
@@ -316,6 +316,9 @@ This workflow will produce a number of directories and files:
It contains the vcf file for the sample if germline.wdl was used and
a single sample vcf was produced.
It also contains a directory per readgroup.
- **structural-variants**: Structural variant calling results per
caller and the merged results from SURVIVOR. Only present if
`germline.wdl` is used with SV calling enabled.
- **CNVcalling**: Contains the CNV calling results for this sample
and its control sample. Only present if `somatic.wdl` is used with
CNV calling enabled.
@@ -329,6 +332,11 @@ This workflow will produce a number of directories and files:
- **PON**: A generated panel of normals and the preprocessed intervals.
Only present if `somatic.wdl` is used with CNV calling enabled and no
PON or preprocessed intervals were provided in the inputs.
- **somatic-sv-calling**: Somatic SV calling results. Only present if
`somatic.wdl` is used with SV calling enabled.
  - **<sample>**: SV results for Delly and Manta per tumor sample.
  - **gridss**: GRIDSS results per normal and the generated PON files
    (if no PON was provided in the inputs).

## Scattering
This workflow performs scattering to speed up analysis on grid computing
457 changes: 412 additions & 45 deletions docs/somatic-inputs.md

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions germline.wdl
@@ -309,8 +309,10 @@ workflow Germline {
cleverVcfs: {description: ""}
matecleverVcfs: {description: ""}
mantaVcfs: {description: ""}
smooveVcfs: {description: ""}
dellyVcfs: {description: ""}
survivorVcfs: {description: ""}
gridssVcfs: {description: ""}
gridssVcfIndexes: {description: ""}
SVunionVcfs: {description: ""}
SVisecVcfs: {description: ""}
2 changes: 1 addition & 1 deletion scripts
Submodule scripts updated 1 files
+8 −8 docs_template.md.j2
30 changes: 30 additions & 0 deletions somatic.wdl
@@ -25,6 +25,7 @@ import "gatk-CNVcalling/CNV-PON.wdl" as cnvPon
import "sample.wdl" as sampleWorkflow
import "somatic-variantcalling/somatic-variantcalling.wdl" as somaticVariantcallingWorkflow
import "structs.wdl" as structs
import "structural-variantcalling/somatic.wdl" as somaticSvCalling
import "tasks/biowdl.wdl" as biowdl
import "tasks/bwa.wdl" as bwa
import "tasks/common.wdl" as common
@@ -45,6 +46,7 @@ workflow Somatic {
Boolean umiDeduplication = false
Boolean collectUmiStats = false
Boolean performCnvCalling = false
Boolean performSvCalling = false
String platform = "illumina"
Boolean useBwaKit = false
Boolean runStrelka = true
@@ -194,7 +196,17 @@ workflow Somatic {
sample = select_first([sample.control])
}

# Collect the inputs for SV calling, so only paired samples are included.
String tumorIdsForSvCalling = sample.id
File tumorBamsForSvCalling = sampleWorkflow.recalibratedBam[casePosition.position]
File tumorBamIndexesForSvCalling = sampleWorkflow.recalibratedBamIndex[casePosition.position]
String controlIdsForSvCalling = select_first([sample.control])
File controlBamsForSvCalling = sampleWorkflow.recalibratedBam[controlPostition.position]
File controlBamIndexesForSvCalling = sampleWorkflow.recalibratedBamIndex[controlPostition.position]
Pair[String, String] tumorControlPairsForSvCalling = (sample.id, select_first([sample.control]))
}

# Allow SNV calling on tumor-only samples as well.
Int controlPos = select_first([controlPostition.position, 0])
String? controlSample = if (defined(sample.control)) then sampleIds[controlPos] else DONOTDEFINETHISSTRING
File? controlBam = if (defined(sample.control)) then sampleWorkflow.recalibratedBam[controlPos] else DONOTDEFINETHISFILE
@@ -245,6 +257,23 @@ workflow Somatic {
}
}

if (performSvCalling) {
    call somaticSvCalling.SomaticSvCalling as SVs {
        input:
            normalIds = select_all(controlIdsForSvCalling),
            normalBams = select_all(controlBamsForSvCalling),
            normalBamIndexes = select_all(controlBamIndexesForSvCalling),
            tumorIds = select_all(tumorIdsForSvCalling),
            tumorBams = select_all(tumorBamsForSvCalling),
            tumorBamIndexes = select_all(tumorBamIndexesForSvCalling),
            pairs = select_all(tumorControlPairsForSvCalling),
            referenceFasta = refFasta,
            referenceFastaFai = refFastaFai,
            bwaIndex = bwidx,
            outputDir = outputDir + "/somatic-sv-calling"
    }
}

call multiqc.MultiQC as multiqcTask {
input:
reports = flatten(sampleWorkflow.reports),
@@ -322,6 +351,7 @@ workflow Somatic {
dbsnpVCF: {description: "dbsnp VCF file used for checking known sites.", category: "required"}
dbsnpVCFIndex: {description: "Index (.tbi) file for the dbsnp VCF. Will be created automatically if not present.", category: "common"}
performCnvCalling: {description: "Whether or not CNV calling should be performed.", category: "common"}
performSvCalling: {description: "Whether or not SV calling should be performed.", category: "common"}
platform: {description: "The platform used for sequencing.", category: "advanced"}
useBwaKit: {description: "Whether or not BWA kit should be used. If false BWA mem will be used.", category: "advanced"}
runStrelka: {description: "Whether or not to run Strelka.", category: "common"}
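
Note on the `somatic.wdl` change: the `...ForSvCalling` values appear to be declared inside a conditional that only runs for samples with a control, which is why the `SomaticSvCalling` call wraps each of them in `select_all`. A minimal, self-contained sketch of that pattern follows. The workflow name `PairedSamplesSketch` and the `Sample` struct are hypothetical stand-ins (the real struct lives in `structs.wdl`, which is not part of this diff), so treat this as an illustration rather than the workflow's actual code.

```wdl
version 1.0

# Hypothetical stand-in for the sample struct defined in structs.wdl.
struct Sample {
    String id
    String? control
}

workflow PairedSamplesSketch {
    input {
        Array[Sample] samples
    }

    scatter (sample in samples) {
        # Only samples that name a control declare these values.
        if (defined(sample.control)) {
            String tumorId = sample.id
            String controlId = select_first([sample.control])
            Pair[String, String] tumorControlPair = (tumorId, controlId)
        }
    }

    output {
        # Outside the scatter and the if-block each declaration is an array of
        # optionals (for example Array[String?]); select_all drops the
        # undefined entries, leaving only the paired samples.
        Array[String] tumorIds = select_all(tumorId)
        Array[String] controlIds = select_all(controlId)
        Array[Pair[String, String]] pairs = select_all(tumorControlPair)
    }
}
```

With one sample that names a control and one that does not, the three output arrays should each hold a single entry, so tumor-only samples never reach the SV callers.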