GATK Cohort Calling Stash Not Working #47

Open
imtiyazhariyani opened this issue Jul 16, 2018 · 4 comments

@imtiyazhariyani

I tried running the GATK Cohort workflow from BioSAILS without running it per chromosome. The workflow executes and produces an sbatch script, but the script contains no rules. Samples are found. However, the following error is displayed when I run the biox command:

"Path::Tiny paths require defined, positive-length parts at /scratch/gencore/.local/easybuild/software/gencore_dev/1.0/lib/perl5/site_perl/5.22.0/BioX/Workflow/Command/run/Rules/Directives/Types/Path.pm line 183."

Below is the yml script:

`---
global:
# Initial Directory Setup
- indir: "data/analysis_imtiyaz"
- outdir: "data/analysis_imtiyaz/cohort"
# indir/outdir is a chained variable
# it gets changed within a rule
- root_in_dir: "data/analysis"
- root_out_dir: "data/analysis/cohort"
# Find Samples
- sample_glob: "data/analysis_imtiyaz/Sample*/gatk/*_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
- by_sample_outdir: '1'
# Analysis Dirs
- combine_dir: "data/analysis_imtiyaz"
# Reference Data
- bwa_mem_reference: "/scratch/gencore/160713_SN7001341_0131_AC8YL9ACXX/Unaligned/Project_Boissinot_lab/data/analysis/reference/leptopelis_transcriptrinity"
- reference: "{$self->bwa_mem_reference}.fa"
# HPC Directives
- HPC:
   - account: 'ieh211'
   - partition: 'serial'
   - module:  'gencore gencore_dev gencore_variant_detection/1.0'
   - cpus_per_task: 1
   - commands_per_node: 1

rules:
- stash_samples:
    local:
            - override_process: 1
            - create_outdir: 0
    process: |-
      {
        use File::Glob;
        use Cwd;
        my @glob = glob(cwd().'/'. $self->sample_glob);
        $self->stash->{sample_files} = @glob;
        ($SILENTLY);
      }

- combine_gvcf:
    local:
            - override_process: 1
            - indir: "{$self->combine_dir}"
            - outdir: "{$self->combine_dir}"
            - OUTPUT: "{$self->outdir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
            - HPC:
               - deps: 'stash_samples'
               - walltime: '48:00:00'
               - mem: '40GB'
            - process_mustache: |
               gatk -Xmx80G -T CombineGVCFs \
               -R {{{reference}}} \
               {{#stash.sample_files}}
                 --variant {{{.}}} \
               {{/stash.sample_files}}
               -o {{{outdir}}}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
    process: |-
      {
        $OUT .= $self->render_mustache($self->process_mustache);
      }

- cohort_calling:
    local:
            - indir: "{$self->{combine_dir}"
            - outdir: "{$self->{combine_dir}"
            - INPUT: "{self->combine_dir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
            - OUTPUT: "{$self->combine_dir}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
            - process_mustache: |
                gatk -T GenotypeGVCFs \
                  -R {{{reference}}} \
                  -stand_call_conf '30' \
                  -o {{{outdir}}}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
            - HPC:
              - deps: 'combine_gvcf'
              - walltime: '48:00:00'
              - mem: '40GB'
    process: |-
      {
        $OUT .= $self->render_mustache($self->process_mustache);
      }

`

@jerowe
Collaborator

jerowe commented Jul 16, 2018

@imtiyazhariyani, let me check this out. It's not immediately apparent what's going on.

@jerowe
Collaborator

jerowe commented Jul 16, 2018

Fixed. There were a few issues with the cohort_calling rule:

- cohort_calling:
    local:
            # Unbalanced brace: should be {$self->combine_dir}
            - indir: "{$self->{combine_dir}"
            - outdir: "{$self->{combine_dir}"
            # Missing "$": should be {$self-> ... }
            - INPUT: "{self->combine_dir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"

Should be

      - cohort_calling:
          local:
              - indir: "{$self->combine_dir}"
              - outdir: "{$self->combine_dir}"
              - INPUT: "{$self->indir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
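
For anyone hitting the same error: both mistakes above are easy to miss by eye, so here is a rough sketch of a check you could run over the quoted values before calling biox. `check_template` is a hypothetical helper written for this comment, not part of BioX::Workflow, and it only catches the two problems seen in this issue (an unbalanced brace and a missing `$` before `self->`).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioX::Workflow): report obvious problems
# in a "{$self->attribute}" style template string.
sub check_template {
    my ($template) = @_;
    my @problems;

    # Count opening and closing braces; they should pair up.
    my $open  = () = $template =~ /\{/g;
    my $close = () = $template =~ /\}/g;
    push @problems, "unbalanced braces ($open open, $close close)"
        if $open != $close;

    # "self->" must be preceded by "$" to be a Perl variable.
    push @problems, 'missing "$" before self->'
        if $template =~ /(?<!\$)\bself->/;

    return @problems;
}

# Single quotes keep Perl from interpolating $self in the sample strings.
for my $template (
    '{$self->{combine_dir}',
    '{self->combine_dir}/ALL_SAMPLES.vcf',
    '{$self->combine_dir}',
) {
    my @problems = check_template($template);
    print $template, ': ', (@problems ? join('; ', @problems) : 'ok'), "\n";
}
```

The first two strings are the broken forms from the rule and each gets one complaint; the third is the corrected form and passes.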

@jerowe
Collaborator

jerowe commented Jul 16, 2018

Here's the whole thing.

---
global:
    # Initial Directory Setup
    - indir: "data/analysis_imtiyaz"
    - outdir: "data/analysis_imtiyaz/cohort"
    # indir/outdir is a chained variable
    - root_in_dir: "data/analysis"
    - root_out_dir: "data/analysis/cohort"
    # Find Samples
    - sample_glob: "data/analysis_imtiyaz/Sample*/gatk/*_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
    - by_sample_outdir: '1'
    # Analysis Dirs
    - combine_dir: "data/analysis_imtiyaz"
    # Reference Data
    - bwa_mem_reference: "/scratch/gencore/160713_SN7001341_0131_AC8YL9ACXX/Unaligned/Project_Boissinot_lab/data/analysis/reference/leptopelis_transcriptrinity"
    - reference: "{$self->bwa_mem_reference}.fa"
    # HPC Directives
    - HPC:
      - account: 'ieh211'
      - partition: 'serial'
      - module:  'gencore gencore_dev gencore_variant_detection/1.0'
      - cpus_per_task: 1
      - commands_per_node: 1
rules:
      - stash_samples:
          local:
              - override_process: 1
              - create_outdir: 0
          process: |-
              {
                use File::Glob;
                use Cwd;
                my @glob = glob(cwd().'/'. $self->sample_glob);
                $self->stash->{sample_files} = \@glob;
                ($SILENTLY);
              }
      - combine_gvcf:
          local:
              - override_process: 1
              - indir: "{$self->combine_dir}"
              - outdir: "{$self->combine_dir}"
              - OUTPUT: "{$self->outdir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
              - HPC:
                   - deps: 'stash_samples'
                   - walltime: '48:00:00'
                   - mem: '40GB'
              - process_mustache: |
                 gatk -Xmx80G -T CombineGVCFs \
                 -R {{{reference}}} \
                 {{#stash.sample_files}}
                   --variant {{{.}}} \
                 {{/stash.sample_files}}
                 -o {{{outdir}}}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
          process: |-
              {
                $OUT .= $self->render_mustache($self->process_mustache);
              }
      - cohort_calling:
          local:
              - indir: "{$self->combine_dir}"
              - outdir: "{$self->combine_dir}"
              - INPUT: "{$self->indir}/ALL_SAMPLES_haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
              - OUTPUT: "{$self->outdir}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf"
              - process_mustache: |
                  gatk -T GenotypeGVCFs \
                    -R {{{reference}}} \
                    -stand_call_conf '30' \
                    -o {{{outdir}}}/ALL_SAMPLES_cohort.haplotype.realigned.withrg.csorted.cleaned.aligned.vcf
              - HPC:
                - deps: 'combine_gvcf'
                - walltime: '48:00:00'
                - mem: '40GB'
          process: |-
            {
              $OUT .= $self->render_mustache($self->process_mustache);
            }
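
One more difference from the originally posted file: in stash_samples the stash is now assigned `\@glob` instead of `@glob`. In Perl, assigning an array to a hash slot evaluates it in scalar context, which stores the element count, while `\@glob` stores a reference to the list itself. The sketch below shows just that difference in plain Perl; the file names and the `%stash` hash are stand-ins for this illustration, and how the mustache renderer treats a bare number in `{{#stash.sample_files}}` is not shown here.

```perl
use strict;
use warnings;

# Stand-in file names; the real rule fills @glob from sample_glob.
my @glob = ('Sample_1.vcf', 'Sample_2.vcf', 'Sample_3.vcf');

# Stand-in for $self->stash (a plain hash for this illustration).
my %stash;

# Scalar context: stores the number of elements, not the elements.
$stash{as_count} = @glob;

# Array reference: stores the list itself, so it can be iterated later.
$stash{as_ref} = \@glob;

print 'as_count: ', $stash{as_count}, "\n";
print 'as_ref:   ', join(' ', @{ $stash{as_ref} }), "\n";
```

With the reference in place, a template section has a list of file names to loop over, one per `--variant` line.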

@imtiyazhariyani
Author

Oops, that was fairly straightforward! It works now. Thank you, @jerowe.
