Spacing out processes by user-defined time #2079
-
Hi, I'd like to run several Blast searches remotely, but according to the NCBI guidelines (https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=DeveloperInfo), one shouldn't contact the server more often than once every 10 seconds. Is there a way to space out instances of the same process by a given (ideally user-defined) time? E.g. 1st Blast search launched at XX:00, 2nd at XX:10, 3rd at XX:20, etc. If not, would it be possible to add this option to Nextflow as a new feature? Many thanks,
-
I'm not sure I understand your use case.
-
Okay, let me try to clarify. Let's say I have N FASTA files containing 1 sequence each. I want to make a remote Blast search using the same Nextflow process (e.g. "searchNCBI", or whatever you want to call it) so that the sequence queries can run in parallel and save time. However, the NCBI server doesn't accept queries launched less than 10 seconds apart (see my previous link). This is a fair usage policy towards other users and, if you don't follow it, your searches will be moved to a slower queue or blocked. Now, is there a way to make Nextflow launch the instances of the process "searchNCBI" with an interval of 10 seconds or more between them? If not, the alternatives would be running a single multi-FASTA search for my N sequences, or N individual searches in a for loop with a delay between iterations. Hope this clarifies my question and use case.
-
If I understand correctly, you could achieve something along these lines with beforeScript and maxForks:
This would run at most one instance of the process at a time and wait 10 seconds before each one, guaranteeing that you do not hit the server more than once every 10 seconds.
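The snippet from this reply is missing from this copy of the thread. A minimal sketch of what it describes, assuming a process named `searchNCBI` as in the question: the `blastn` command line, the `nt` database and the output file name are illustrative assumptions, not taken from the original reply.

```groovy
// Hypothetical process: only maxForks and beforeScript come from the reply above.
process searchNCBI {
    maxForks 1                // run at most one task of this process at a time
    beforeScript 'sleep 10'   // wait 10 seconds before each task's script starts

    input:
    path query

    script:
    // Assumed command: a remote BLAST+ search against the nt database.
    """
    blastn -remote -db nt -query ${query} -out ${query.baseName}.blast.txt
    """
}
```

Because `maxForks 1` serializes the tasks, the gap between two submissions is the run time of the previous search plus the 10-second pause, so the searches do not overlap.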
-
Thanks Brandon. Unfortunately, the strategy you suggested waits until the current instance of the process is complete before firing the next one (you can only see that if you use a command that takes longer than 10 seconds, rather than a very quick one). I guess this is not yet possible in Nextflow, but it would be a nice option to have, in order to avoid overloading public servers with too many concurrent requests. Cheers,
-
Try adding the following setting in the nextflow.config file:
Find details here: https://www.nextflow.io/docs/latest/config.html?highlight=rate%20limit#scope-executor
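The setting itself is missing from this copy of the thread. Given the link to the rate-limit entry of the executor scope and the follow-up remark about fractions, it was presumably `submitRateLimit`. A minimal sketch, assuming a limit of one task submission every 10 seconds to match the NCBI guideline:

```groovy
// nextflow.config
// Assumed value: at most 1 task submission per 10 seconds.
executor {
    submitRateLimit = '1/10s'
}
```

Unlike `maxForks 1`, this limits only how often tasks are submitted, so a search that takes longer than 10 seconds does not hold back the next one. Note that the setting sits in the executor scope, so it applies to task submission by the executor as a whole rather than to a single process.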
-
Thank you Paolo, that's exactly what I was looking for! I didn't know this executor scope setting could also accept fractions, very convenient.