- Build the pipeline up bit by bit.
- Follow the best practices from the Nextflow training material.
- Start with a few files in a local directory, then move to AWS Batch.
- Document and tag every step.
- Let's start with a template that borrows from, but is not exactly the same as, an nf-core template.
- In particular, I'd like to use a few of the base configuration files from nf-core, so we can assign different resources to different processes depending on their labels.
- This should run with
nextflow run main.nf --input data/samplesheet.csv --outdir results
- The result should be just the channel containing the example files
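
As a rough sketch of what this first step amounts to, the workflow below reads the samplesheet and emits one channel item per sample. The column names `sample`, `fastq_1` and `fastq_2` are an assumption borrowed from nf-core conventions, not something taken from this repository, so adjust them to match the real samplesheet.

```nextflow
// Minimal sketch: turn a samplesheet into a channel of [meta, reads] items.
// ASSUMPTION: the CSV has a header row with columns sample, fastq_1, fastq_2.
params.input = 'data/samplesheet.csv'

workflow {
    Channel
        .fromPath(params.input, checkIfExists: true)
        .splitCsv(header: true)          // one map per CSV row, keyed by header
        .map { row ->
            def meta  = [id: row.sample]
            // Single-end samples leave fastq_2 empty, so only keep it if present
            def reads = row.fastq_2
                ? [file(row.fastq_1), file(row.fastq_2)]
                : [file(row.fastq_1)]
            [meta, reads]
        }
        .view()                          // print each item so the channel can be inspected
}
```

Viewing the channel before wiring in any processes makes it easy to confirm that every sample and file path was picked up as expected.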
I decided to see whether the FASTQC_TRIMGALORE subworkflow (from the cutandrun pipeline) would work here.
It does, which makes for another good tag point.
If you check out this tag, you should be able to run it with `nextflow run main.nf --input data/samplesheet.csv --outdir results`.
- This version should place the trimmed results and fastqc results into the outdir.
- This version records software versions.
I still needed to build the genome indices (both bowtie2 and faidx), so I cobbled together a PREPARE_GENOME subworkflow based on one of the nf-core pipelines. I'm not sure why the bowtie2/build module from nf-core uses the `process_high` label, so I changed it to `process_medium`.
Should be runnable with
nextflow run main.nf --input data/samplesheet.csv --outdir results/ --fasta data/genome/ecoli_rel606.fasta
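
To illustrate how labels map to resources, here is a minimal sketch of an nf-core style base configuration. The label names follow the nf-core convention, but the resource values are illustrative assumptions, and the `BOWTIE2_BUILD` selector assumes that is the process name used by the module. The `withName` block shows one possible alternative to editing the module's label directly; it is not necessarily what this pipeline does.

```nextflow
// Sketch of a conf/base.config style file (values are illustrative assumptions).
process {
    // Defaults for processes without a matching label
    cpus   = 1
    memory = 6.GB
    time   = 4.h

    withLabel: process_low {
        cpus   = 2
        memory = 12.GB
    }
    withLabel: process_medium {
        cpus   = 6
        memory = 36.GB
    }
    withLabel: process_high {
        cpus   = 12
        memory = 72.GB
    }

    // ASSUMPTION: the index-building process is named BOWTIE2_BUILD.
    // A withName selector overrides the label's resources for that one
    // process, without touching the module file itself.
    withName: 'BOWTIE2_BUILD' {
        cpus   = 6
        memory = 36.GB
    }
}
```

For a small genome such as the E. coli reference used here, medium-sized resources should be plenty for index building, and requesting less makes the job easier to schedule.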
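This runs bowtie2 and samtools. It took a bit of work to realise that the bowtie2 index needed to be a value channel rather than a queue channel. A queue channel holding a single index is consumed by the first sample, so the alignment would run only once; a value channel can be read repeatedly, so the index is re-used for every sample.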
Should be runnable with
nextflow run main.nf --input data/samplesheet.csv --outdir results/ --fasta data/genome/ecoli_rel606.fasta
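
The difference between queue and value channels can be seen in a small, self-contained sketch. The `ALIGN` process and the string `'ecoli_index'` are hypothetical stand-ins for the real alignment process and index path; only the channel behaviour is the point here.

```nextflow
// Toy process: pairs each sample with the index it receives.
// ALIGN is a hypothetical stand-in, not the pipeline's real process.
process ALIGN {
    input:
    val sample
    val index

    output:
    stdout

    script:
    """
    echo "aligning ${sample} against ${index}"
    """
}

workflow {
    samples = Channel.of('sample1', 'sample2', 'sample3')

    // Queue channel with a single item: if this were passed to ALIGN,
    // the process would run once and then stop, because the index
    // item is consumed by the first sample.
    // index = Channel.of('ecoli_index')

    // .first() turns the queue channel into a value channel,
    // which can be read once per sample.
    index = Channel.of('ecoli_index').first()

    ALIGN(samples, index)
    ALIGN.out.view()
}
```

With the value channel, `ALIGN` runs three times, once per sample. Swapping in the commented-out queue channel should make it run only once, which is the symptom described above.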
The final VCF files should go into results/03_vcf/vcf.
I used a slightly different method for the final variant calling, but the basic idea is the same.
The final pipeline can be run with `bash s3-run.sh`. Please change your group name before running.