Updating workflow files to work with Igenome #388
Changes from 4 commits
4adf3d5
10b88c6
c5e9123
cc697dc
0573675
7d534cd
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -43,18 +43,21 @@ workflow NFCORE_SMRNASEQ { | |
take: | ||
ch_input // channel: samplesheet file as specified to --input | ||
ch_samplesheet // channel: sample fastqs parsed from --input | ||
fasta // params.fasta | ||
mirtrace_species // params.mirtrace_species | ||
bowtie_index // params.bowtie_index | ||
ch_versions // channel: [ path(versions.yml) ] | ||
|
||
main: | ||
//Config checks | ||
// Check optional parameters | ||
if (!params.mirgenedb && !params.mirtrace_species) { | ||
if (!params.mirgenedb && !mirtrace_species) { | ||
exit 1, "Reference species for miRTrace is not defined via the --mirtrace_species parameter." | ||
} | ||
|
||
// Genome options | ||
def mirna_gtf_from_species = params.mirtrace_species ? (params.mirtrace_species == 'hsa' ? "https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hsa.gff3" : "https://mirbase.org/download/CURRENT/genomes/${params.mirtrace_species}.gff3") : false | ||
def mirna_gtf = params.mirna_gtf ?: mirna_gtf_from_species | ||
mirna_gtf_from_species = mirtrace_species ? (mirtrace_species == 'hsa' ? "https://github.com/nf-core/test-datasets/raw/smrnaseq/miRBase/hsa.gff3" : "https://mirbase.org/download/CURRENT/genomes/${mirtrace_species}.gff3") : false | ||
mirna_gtf = params.mirna_gtf ?: mirna_gtf_from_species | ||
|
||
if (!params.mirgenedb) { | ||
if (params.mature) { reference_mature = file(params.mature, checkIfExists: true) } else { exit 1, "Mature miRNA fasta file not found: ${params.mature}" } | ||
|
@@ -108,12 +111,12 @@ workflow NFCORE_SMRNASEQ { | |
) | ||
ch_versions = ch_versions.mix(FASTQ_FASTQC_UMITOOLS_FASTP.out.versions) | ||
|
||
ch_fasta = params.fasta ? file(params.fasta): [] | ||
ch_fasta = fasta ? file(fasta): [] | ||
ch_reads_for_mirna = FASTQ_FASTQC_UMITOOLS_FASTP.out.reads | ||
|
||
// even if bowtie index is specified, there still needs to be a fasta. | ||
// without fasta, no genome analysis. | ||
if(params.fasta) { | ||
if(fasta) { | ||
Perhaps since we are changing this for all fasta, we could adhere to adding the ch_ prefix: https://nf-co.re/docs/guidelines/components/subworkflows
If this breaks things or you think it is too complicated we can open a new issue and address it separately.

I originally tackled this issue by converting value variables to value channels, and converting file string variables to path channels, but it created a lot of issues downstream. I want to tackle the conversion of all of these variables into channels in #390. To make sure that these are the only changes implemented in the branch

This fasta variable is just equal to params.fasta, so I will convert it to val_fasta.
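The conversion described in the reply above (plain values to value channels, path strings to path channels) might look roughly like the following. This is a minimal sketch, not the code from this PR or from #390: the val_ and ch_ variable names and the null defaults are assumptions made for illustration.

```nextflow
// Hypothetical sketch: names and defaults are illustrative, not taken from the PR.
params.fasta            = null
params.mirtrace_species = null

workflow {
    // Plain values, as they arrive from params
    def val_fasta            = params.fasta            // path string or null
    def val_mirtrace_species = params.mirtrace_species // e.g. 'hsa' or null

    // Value channel: holds a single item that can be read any number of times
    ch_mirtrace_species = val_mirtrace_species
        ? Channel.value(val_mirtrace_species)
        : Channel.empty()

    // Path channel: the string is resolved to a file object before being wrapped
    ch_fasta = val_fasta
        ? Channel.value(file(val_fasta, checkIfExists: true))
        : Channel.empty()

    ch_mirtrace_species.view { species -> "mirtrace_species: ${species}" }
    ch_fasta.view { fasta -> "fasta: ${fasta}" }
}
```

One possible reason such a change causes friction downstream is that checks like `if (fasta)` in the diff test a plain value, and would need rewriting once the variable is a channel. That is an assumption, since the thread does not say which issues came up.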
||
//Prepare bowtie index, unless specified | ||
//This needs to be done here as the index is used by GENOME_QUANT | ||
if(params.bowtie_index) { | ||
|
@@ -124,7 +127,7 @@ workflow NFCORE_SMRNASEQ { | |
} else { | ||
Channel.fromPath("${params.bowtie_index}**ebwt", checkIfExists: true).ifEmpty{ error "Bowtie1 index directory not found: ${params.bowtie_index}" }.filter { it != null }.set { ch_bowtie_index } | ||
} | ||
} else { | ||
} else { | ||
INDEX_GENOME ( [ [:], ch_fasta ] ) | ||
ch_versions = ch_versions.mix(INDEX_GENOME.out.versions) | ||
ch_bowtie_index = INDEX_GENOME.out.index | ||
|
@@ -181,8 +184,8 @@ workflow NFCORE_SMRNASEQ { | |
// | ||
// SUBWORKFLOW: MIRTRACE | ||
// | ||
if (params.mirtrace_species) { | ||
MIRTRACE(ch_mirtrace_inputs) | ||
if (mirtrace_species) { | ||
Same comment as before, perhaps adding
||
MIRTRACE(ch_mirtrace_inputs, mirtrace_species) | ||
ch_versions = ch_versions.mix(MIRTRACE.out.versions) | ||
} else { | ||
log.warn "The parameter --mirtrace_species is absent. MIRTRACE quantification skipped." | ||
|
@@ -209,20 +212,21 @@ workflow NFCORE_SMRNASEQ { | |
ch_reads_for_mirna = CONTAMINANT_FILTER.out.filtered_reads | ||
|
||
} | ||
|
||
// MIRNA_QUANT should still run when mirtrace_species is null, as long as mirgenedb is true | |
MIRNA_QUANT ( | ||
[ [:], reference_mature], | ||
[ [:], reference_hairpin], | ||
mirna_gtf, | ||
ch_reads_for_mirna | ||
[ [:], reference_mature], | ||
[ [:], reference_hairpin], | ||
mirna_gtf, | ||
ch_reads_for_mirna, | ||
mirtrace_species | ||
) | ||
ch_versions = ch_versions.mix(MIRNA_QUANT.out.versions) | ||
|
||
// | ||
// GENOME | ||
// | ||
genome_stats = Channel.empty() | ||
if (params.fasta){ | ||
if (fasta){ | ||
GENOME_QUANT ( ch_bowtie_index, ch_fasta, MIRNA_QUANT.out.unmapped ) | ||
genome_stats = GENOME_QUANT.out.stats | ||
ch_versions = ch_versions.mix(GENOME_QUANT.out.versions) | ||
|
@@ -306,7 +310,7 @@ workflow NFCORE_SMRNASEQ { | |
ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.mature_stats.collect({it[1]}).ifEmpty([])) | ||
ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.hairpin_stats.collect({it[1]}).ifEmpty([])) | ||
ch_multiqc_files = ch_multiqc_files.mix(MIRNA_QUANT.out.mirtop_logs.collect().ifEmpty([])) | ||
if (params.mirtrace_species) { | ||
if (mirtrace_species) { | ||
ch_multiqc_files = ch_multiqc_files.mix(MIRTRACE.out.results.collect().ifEmpty([])) | ||
} | ||
|
||
|
Interesting. I thought these had to be defined in the PIPELINE_INITIALISATION and the outputs from PIPELINE_INITIALISATION have to be passed to the main workflow. See: https://github.com/nf-core/taxprofiler/blob/5d3ee5513a84f92773c8376c55b5f4da39835307/main.nf#L74-L91 or https://github.com/nf-core/phaseimpute/blob/f2823b024800155cd87d70f406794324c4123dfd/main.nf#L140-L151 but I see sarek and rnaseq have different approaches as well.
I'll leave it as it is, as we discussed. Since there doesn't seem to be a standard for how to set up the new template, we can always change it later if we have a reason to do so.
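For reference, the approach kept in this PR (reading params once in the entry workflow and handing them to the named workflow through take:) can be sketched as below. The workflow name EXAMPLE_SMRNASEQ, the val_ input names, and the report channel are hypothetical; the sketch only illustrates the pattern and is not the pipeline's main.nf.

```nextflow
// Hypothetical sketch of passing params into a named workflow via take:
params.fasta            = null
params.mirtrace_species = null

workflow EXAMPLE_SMRNASEQ {
    take:
    val_fasta            // plain value, mirrors params.fasta
    val_mirtrace_species // plain value, mirrors params.mirtrace_species

    main:
    // Inside the workflow only the inputs are used, never params directly
    if (!val_mirtrace_species) {
        log.warn "No species given; species-specific steps would be skipped."
    }
    ch_report = Channel.of("fasta=${val_fasta}, species=${val_mirtrace_species}")

    emit:
    report = ch_report
}

workflow {
    // The entry workflow is the only place that touches params
    EXAMPLE_SMRNASEQ(params.fasta, params.mirtrace_species)
    EXAMPLE_SMRNASEQ.out.report.view()
}
```

Whether these values are resolved in the entry workflow, as here, or in a PIPELINE_INITIALISATION subworkflow, as in the taxprofiler and phaseimpute examples linked above, the named workflow's interface stays the same, so moving to the other approach later should not require changes inside it.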