Commit

Adding regions file to WES mk_examples command to speed up run time
skchronicles committed Nov 14, 2024
1 parent ebca284 commit ea59355
Showing 2 changed files with 18 additions and 8 deletions.
11 changes: 7 additions & 4 deletions workflow/rules/germline.smk
@@ -31,6 +31,7 @@ rule deepvariant_make_examples:
input:
bam = join(workpath, "BAM", "{name}.sorted.bam"),
bai = join(workpath, "BAM", "{name}.sorted.bam.bai"),
+ bed = provided(join(workpath, "references", "wes_regions_50bp_padded.bed"), run_wes),
output:
success = join(workpath, "deepvariant", "mk_examples", "{name}.make_examples.success"),
params:
@@ -46,7 +47,11 @@ rule deepvariant_make_examples:
w.name,
int(allocated("threads", "deepvariant_make_examples", cluster))
)),
- message: "Running DeepVariant make_examples on '{input.bam}' input file"
+ # Call variants within regions BED
+ # file created from WES capture kit
+ wes_region_option = lambda _: "--regions {0}".format(
+ join(workpath, "references", "wes_regions_50bp_padded.bed"),
+ ) if run_wes else '',
threads: int(allocated("threads", "deepvariant_make_examples", cluster))
container: config['images']['deepvariant']
envmodules: config['tools']['deepvariant']
@@ -74,7 +79,7 @@ rule deepvariant_make_examples:
--halt 2 \\
--line-buffer \\
make_examples \\
- --mode calling \\
+ --mode calling {params.wes_region_option} \\
--ref {params.genome} \\
--reads {input.bam} \\
--examples {params.example} \\
@@ -131,7 +136,6 @@ rule deepvariant_call_variants:
# @WES = "/opt/models/wes/model.ckpt"
# @WGS = "/opt/models/wgs/model.ckpt"
ckpt = lambda _: "/opt/models/wes/model.ckpt" if run_wes else "/opt/models/wgs/model.ckpt",
- message: "Running DeepVariant call_variants on '{wildcards.name}' sample"
threads: int(allocated("threads", "deepvariant_call_variants", cluster))
container: config['images']['deepvariant']
envmodules: config['tools']['deepvariant']
@@ -197,7 +201,6 @@ rule deepvariant_postprocess_variants:
w.name,
int(allocated("threads", "deepvariant_make_examples", cluster))
)),
- message: "Running DeepVariant postprocess_variants on '{input.callvar}' input file"
threads: int(allocated("threads", "deepvariant_postprocess_variants", cluster))
container: config['images']['deepvariant']
envmodules: config['tools']['deepvariant']
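
Note on the wes_region_option param added above: the lambda returns a "--regions <bed>" fragment for WES runs and an empty string otherwise, so the shell template can always interpolate {params.wes_region_option} without branching. The sketch below restates that logic as a plain Python function so it can be checked outside Snakemake. The function name build_regions_option and the example workpath are hypothetical; only the BED file name and the "--regions" flag are taken from the diff.

```python
from os.path import join


def build_regions_option(workpath, run_wes):
    """Return the optional '--regions <bed>' fragment (hypothetical helper).

    Mirrors the lambda in the diff: WES runs restrict calling to the padded
    capture-kit BED file, while WGS runs get an empty string so the flag is
    simply absent from the rendered command.
    """
    if not run_wes:
        return ""
    bed = join(workpath, "references", "wes_regions_50bp_padded.bed")
    return "--regions {0}".format(bed)


# Example rendering of the command fragment for both run types.
# "/data/run1" is an illustrative path, not one used by the pipeline.
wes_cmd = "--mode calling {0}".format(build_regions_option("/data/run1", True))
wgs_cmd = "--mode calling {0}".format(build_regions_option("/data/run1", False))
```

For a WGS run the rendered line ends with a trailing space before the line continuation, which the shell ignores.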
15 changes: 11 additions & 4 deletions workflow/rules/somatic.smk
@@ -630,8 +630,9 @@ rule hmftools_sage:
data using the same set of options for WGS. At the current moment, sage
does not have an option to restrict variant calling to specific regions.
It does have an -high_depth_mode option; however, the authors state it
- should only be used for small targeted panels. For more information
- about hmftools visit github:
+ should only be used for small targeted panels. In the 'somatic_selectvar'
+ rule, any variants outside the padded regions/capture-kit BED file are
+ removed in WES data. For more information about hmftools visit github:
https://github.com/hartwigmedical/hmftools
@Input:
Sorted BAM file (scatter-per-tumor-sample)
@@ -796,6 +797,11 @@ rule deepsomatic_make_examples:
# @WES = "/opt/models/deepsomatic/wes"
# @WGS = "/opt/models/deepsomatic/wgs"
ckpt = lambda _: "/opt/models/deepsomatic/wes" if run_wes else "/opt/models/deepsomatic/wgs",
+ # Call variants within regions BED
+ # file created from WES capture kit
+ wes_region_option = lambda _: "--regions {0}".format(
+ join(workpath, "references", "wes_regions_50bp_padded.bed"),
+ ) if run_wes else '',
# Get tumor and normal sample names
tumor = '{name}',
# Building option for the paired normal sorted bam
@@ -848,7 +854,7 @@ rule deepsomatic_make_examples:
--reads_tumor {input.tumor} {params.normal_bam_option} \\
--sample_name_tumor {params.tumor} {params.normal_name_option} \\
--examples {params.example} \\
- --checkpoint "{params.ckpt}" \\
+ --checkpoint "{params.ckpt}" {params.wes_region_option} \\
--vsc_max_fraction_indels_for_non_target_sample "0.5" \\
--vsc_max_fraction_snps_for_non_target_sample "0.5" \\
--vsc_min_fraction_indels "0.05" \\
@@ -1337,7 +1343,8 @@ rule somatic_selectvar:
somatic callers. This step takes the somatic calls from all the callers
(assumes already re-headered if needed, i.e. strelka and muse), and then
runs bcftools norm to split multi-allelic sites AND gatk SelectVariants
- to filter sites.
+ to filter sites. For WES data, this step will also remove any variants
+ that are outside the padded regions/capture-kit BED file.
@Input:
Per sample, per caller, VCF somatic variants
@Output: