Clustermq multiprocess for main jobs and ssh for worker jobs #198
Comments
Maybe […]. Just thinking out loud.
This is a rare edge case, and trying to build something directly into […]. For your use case, it seems like what we really want is heterogeneous transient workers, a problem best suited for the […].
For sure this is an edge case. I didn't think `targets` should change to accommodate this, but I was hoping to discuss whether it's possible with existing infrastructure. This actually works, although it's a bit hacky: I had to set the `clustermq.scheduler` option right before each `tar_make_clustermq()` call.

```r
library(targets)
tar_script({
  tmpl <- list(
    job_name = "name",
    partition = "partition",
    node = "node"
  )
  tar_option_set(
    resources = tmpl,
    deployment = "worker",
    storage = "main",
    retrieval = "main"
  )
  tar_pipeline(
    tar_target(id, 1:5),
    tar_target(
      foo,
      Sys.sleep(id),
      pattern = map(id)
    ),
    tar_target(
      bar,
      Sys.sleep(id),
      pattern = map(id)
    )
  )
})

# Run certain targets as multiprocess
withr::with_options(
  list(clustermq.scheduler = "multiprocess"),
  tar_make_clustermq(
    names = foo,
    workers = 2L,
    callr_function = NULL
  )
)

# Run others on the HPC via SSH
withr::with_options(
  list(clustermq.scheduler = "ssh", clustermq.ssh.host = "USER@DOMAIN"),
  tar_make_clustermq(
    names = bar,
    workers = 2L,
    callr_function = NULL
  )
)
```
Seems like that should work (even with the default […]).
Just thought of a super simple way to get around this in […]:

```r
# _targets.R:
library(targets)
future::plan(future::sequential)
tar_pipeline(
  tar_target(x, seq_len(4)),
  tar_target(
    y,
    Sys.sleep(30),
    pattern = map(x),
    resources = list(plan = future.callr::callr)
  )
)
```

```r
# R console:
library(targets)
tar_make_future(workers = 4)
```

The catch here is that it won't actually help you with […].
Not sure why multisession futures don't work this way, though. I will post details in another thread.
Cool, thanks for sharing. I haven't really ever taken the time to develop a good mental model for […]. Related set of questions:

Obviously my rare use-case fits in, but I am wondering whether it could be more generally useful.
Admittedly, I only use it for parallel processing on non-lazy transient workers. It does go a lot deeper in terms of the abstraction and asynchronicity.
I'm not sure what you mean.
I wouldn't count on global options with the […].
Hi Will,
I haven't thought this through well enough to really assess its feasibility, but I wanted to scribble my thoughts down and get your input. Not sure if we need to loop @mschubert in or not.
As you know, for one of my current projects I am using the "mostly local, sometimes remote" approach: my project lives on my local machine, but some computationally intensive tasks are selectively sent to the HPC via SSH thanks to `clustermq`. This works great. However, when using `options(clustermq.scheduler = "ssh")`, you have only two options: run jobs locally and sequentially in the `"main"` process, or send them via `ssh`. The majority of the tasks run in the `"main"` R process and are forced to run sequentially, all for the ability to send a few select jobs to the HPC.

So, long story short, I am wondering if it would somehow be possible to use `"multiprocess"` for jobs with `deployment = "main"` and `"ssh"` for targets with `deployment = "worker"`. I know this is a convoluted use-case, but I am actually constrained to using this workflow for this particular project and was just wondering if something like that could possibly work.

Reasons why I don't just run everything via `ssh`:

- Some of the tasks are trivial and quick, and the overhead of sending them to the HPC over sockets is unnecessary.
- Some of the targets rely on the local NFS for access to files which cannot be moved to the cluster or cloud.

Reasons why I don't just run everything locally:

- Memory constraints on my local machine.
- There are fewer computationally intensive tasks, but they sometimes take days to run.
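For context, `clustermq` chooses its backend from global R options rather than from function arguments, which is what makes the per-call scheduler switching discussed in this thread possible at all. A minimal sketch of the two option sets involved (the host string is a placeholder, not a real address):

```r
# clustermq reads its backend from global options, so the same pipeline
# code can be pointed at different backends between calls.
options(clustermq.scheduler = "multiprocess")  # local parallel workers

# ...and later, for work that should go to the HPC over SSH:
options(clustermq.scheduler = "ssh")
options(clustermq.ssh.host = "user@host")      # placeholder host

# Inspect what is currently set:
getOption("clustermq.scheduler")
```

Wrapping these in `withr::with_options()`, as in the example above, scopes each setting to a single `tar_make_clustermq()` call instead of mutating global state.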