How to run YAMP on an HPC
This tutorial explains how to run YAMP on high-performance computing (HPC) facilities.
To run on different systems (HPC or not), YAMP takes advantage of the Nextflow framework architecture and, specifically, of its executors. Briefly, a Nextflow executor is the component that determines on which system YAMP is run and orchestrates its execution.
For instance, in the rosalind profile (Rosalind is King's College London's HPC, more on profiles here), the executor is set to:
process.executor = 'slurm'
but Nextflow supports multiple other executors. You can find more information on the supported schedulers in the Nextflow documentation.
To run YAMP locally, you should not specify any executor.
Usually, schedulers offer users multiple queues to which jobs can be submitted. For instance, there may be queues dedicated to short or long jobs, or to jobs that require low or high memory. YAMP uses the Nextflow queue directive to specify the HPC queue(s) to be used.
For instance, in the rosalind profile, the queue is set to:
process.queue = 'brc'
To run on your system, you should simply specify the name of your queue(s), for instance:
process.queue = 'highmem,long-highmem'
You can find more information on the queue directive in the Nextflow documentation.
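Putting the executor and queue settings together, a profile for your own cluster might look like the sketch below. The profile name `myhpc` is hypothetical, and the executor and queue values are placeholders taken from the examples above; replace them with the scheduler and queue names used on your system.

```groovy
// Sketch of a custom profile (hypothetical name 'myhpc') to add to your Nextflow configuration.
profiles {
    myhpc {
        // Scheduler used to submit jobs; 'slurm' is the value used in the rosalind profile.
        // Other executor names accepted by Nextflow include 'sge', 'lsf' and 'pbs'.
        process.executor = 'slurm'

        // Queue(s) the jobs are submitted to; placeholder names, use your own.
        process.queue = 'highmem,long-highmem'
    }
}
```

A profile defined this way is selected at launch time with Nextflow's `-profile` option, for example `-profile myhpc`.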
Nextflow provides a number of other directives for allocating the correct resources on the local or remote system; they are explained in the Nextflow documentation.
Please note that the process specifications you will find in the ./conf/base.config file (that is, time, CPU and memory) have been optimised using our in-house metagenomic dataset, which is composed of about 2000 faecal samples with very different data quality and thus very different requirements. These values may require some tuning, but we are confident that they will cover most users' scenarios.
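If the defaults need tuning for your data, one option is to override them in a separate configuration file rather than editing ./conf/base.config directly. The following is a minimal sketch and does not reproduce YAMP's actual settings: the process name `my_process` and all resource values are hypothetical, and it assumes a Nextflow version that supports the `withName` selector.

```groovy
// Sketch of a custom configuration file overriding resource requests.
process {
    // Defaults applied to every process (illustrative values only).
    cpus = 4
    memory = '16 GB'
    time = '6h'

    // Override for a single process; 'my_process' is a hypothetical name,
    // replace it with the name of the process you want to tune.
    withName: 'my_process' {
        memory = '64 GB'
        time = '24h'
    }
}
```

Such a file can be passed to Nextflow at launch time with the `-c` option, so that its values take precedence over those in the pipeline's own configuration.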