---
output:
  github_document:
    md_extensions: -auto_identifiers+header_attributes
---
[![DOI](http://joss.theoj.org/papers/10.21105/joss.01493/status.svg)](https://doi.org/10.21105/joss.01493)
[![Travis build status](https://travis-ci.org/USCbiostats/slurmR.svg?branch=master)](https://travis-ci.org/USCbiostats/slurmR)
[![codecov](https://codecov.io/gh/USCbiostats/slurmR/branch/master/graph/badge.svg)](https://codecov.io/gh/USCbiostats/slurmR)
[![CRAN status](https://www.r-pkg.org/badges/version/slurmR)](https://CRAN.R-project.org/package=slurmR)
[![CRAN downloads](http://cranlogs.r-pkg.org/badges/grand-total/slurmR)](https://cran.r-project.org/package=slurmR)
[![status](https://tinyverse.netlify.com/badge/slurmR)](https://CRAN.R-project.org/package=slurmR)
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
options(width = 80)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "# ",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# slurmR: A Lightweight Wrapper for Slurm <img src="man/figures/logo.png" height="180px" align="right"/>
Slurm Workload Manager is a popular HPC cluster job scheduler found in many of the top 500 supercomputers. The `slurmR` R package provides an R wrapper for it that matches the `parallel` package's syntax. That is, just as `parallel` provides `parLapply`, `clusterMap`, `parSapply`, etc., `slurmR` provides `Slurm_lapply`, `Slurm_Map`, `Slurm_sapply`, etc.
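As a rough illustration of that correspondence, the sketch below runs a computation with `parallel` on a local two-worker cluster and then builds (without submitting) the matching `slurmR` job. The worker count, the `njobs = 2` value, and the `requireNamespace()` guard are choices made for this illustration only.

```r
library(parallel)

# Toy data: 100 vectors of length 50 ~ Unif(0,1)
set.seed(881)
x <- replicate(100, runif(50), simplify = FALSE)

# The `parallel` way: a local socket cluster with two workers
cl <- makeCluster(2)
ans_parallel <- parLapply(cl, x, mean)
stopCluster(cl)

# The slurmR counterpart has the same shape (data first, function second).
# The guard lets this block run where slurmR is not installed;
# plan = "none" only creates the job object and skips submission.
if (requireNamespace("slurmR", quietly = TRUE)) {
  job <- slurmR::Slurm_lapply(x, mean, njobs = 2, plan = "none")
  slurmR::Slurm_clean(job)
}
```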
While there are other alternatives such as `future.batchtools`, `batchtools`, `clustermq`, and `rslurm`, this R package has the following goals:
1. It is dependency-free, which means that it works out of the box.
2. It emphasizes similarity to the workflow of the R package `parallel`.
3. It provides a general framework for users to create their own wrappers without using template files.
4. It is specialized on Slurm, meaning more flexibility (no need to modify template files) and, in the future, better debugging tools (e.g. job resubmission).
5. It provides a backend for the
[parallel](https://CRAN.R-project.org/view=HighPerformanceComputing)
package, providing an out-of-the-box method for creating Socket cluster objects
for multi-node operations. (See the examples below on how this can be
used with other R packages.)
Check out the [VS section](#vs) for a comparison of `slurmR` with other R packages.
Wondering who is using Slurm? Check out the [list at the end of this document](#who-uses-slurm).
## Installation
From your HPC command line, you can install the development version from [GitHub](https://github.com/) with:
```bash
$ git clone https://github.com/USCbiostats/slurmR.git
$ R CMD INSTALL slurmR/
```
The second line assumes you have R available on your system (usually loaded via
`module load R` or some other command). Alternatively, use the `devtools` package from within R:
``` r
# install.packages("devtools")
devtools::install_github("USCbiostats/slurmR")
```
## Citation
```{r cite, echo=FALSE, comment=""}
citation("slurmR")
```
## Example 1: Computing means (and looking under the hood)
```{r simple-example}
library(slurmR)
# Suppose that we have 100 vectors of length 50 ~ Unif(0,1)
set.seed(881)
x <- replicate(100, runif(50), simplify = FALSE)
```
We can use the function `Slurm_lapply` to distribute computations:
```{r example1}
ans <- Slurm_lapply(x, mean, plan = "none")
Slurm_clean(ans) # Cleaning after you
```
Notice the `plan = "none"` option. It tells `Slurm_lapply` to only create the job object but do nothing with it, i.e., skip submission. To get more information, we can turn verbose mode on:
```{r example1-with-verb}
opts_slurmR$verbose_on()
ans <- Slurm_lapply(x, mean, plan = "none")
Slurm_clean(ans) # Cleaning after you
```
## Example 2: Job resubmission
The following example is from the package's manual.
```r
# Submitting a simple job
job <- Slurm_EvalQ(slurmR::WhoAmI(), njobs = 20, plan = "submit")
# Checking the status of the job (we can simply print)
job
status(job) # or use the status function
sacct(job) # or get more info with the sacct wrapper.
# Suppose some of the jobs are taking too long to complete (say 1, 2, and 15 through 20);
# we can stop the job and resubmit those array elements as follows:
scancel(job)
# Resubmitting only the selected array elements
sbatch(job, array = "1,2,15-20") # A new jobid will be assigned
# Once it's done, we can collect all the results at once
res <- Slurm_collect(job)
# And clean up if we don't need to use it again
Slurm_clean(res)
```
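The `array` argument above takes a Slurm array specification as a string. When the indices to resubmit come from a vector, a small helper can build that string. `to_array_spec()` below is a hypothetical helper written for this README, not a function exported by `slurmR`, and it assumes Slurm's usual comma-and-range array syntax.

```r
# Hypothetical helper (not part of slurmR): collapse integer indices into a
# Slurm array specification made of single values and ranges.
to_array_spec <- function(idx) {
  idx <- sort(unique(as.integer(idx)))
  if (length(idx) == 0L) return("")
  # A new run starts wherever the gap to the previous index is not 1
  run_id <- cumsum(c(TRUE, diff(idx) != 1L))
  runs <- split(idx, run_id)
  parts <- vapply(runs, function(r) {
    if (length(r) == 1L) as.character(r) else paste0(r[1L], "-", r[length(r)])
  }, character(1))
  paste(parts, collapse = ",")
}

spec <- to_array_spec(c(1, 2, 15:20))
spec
# [1] "1-2,15-20"
```

The result could then be passed along as `sbatch(job, array = spec)`. Note that `"1-2,15-20"` and `"1,2,15-20"` describe the same set of array elements.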
Take a look at the vignette [here](vignettes/getting-started.Rmd).
## Example 3: Using slurmR and future/doParallel/boot/...
The function `makeSlurmCluster` creates a PSOCK cluster within a Slurm HPC network,
meaning that users can go beyond a single node cluster object and take advantage
of Slurm to create a multi-node cluster object. This feature then allows using
`slurmR` with other R packages that support working with `SOCKcluster` class
objects. Here are some examples.
With the [`future`](https://cran.r-project.org/package=future) package:
```r
library(future)
library(slurmR)
cl <- makeSlurmCluster(50)
# It only takes using a cluster plan!
plan(cluster, cl)
# ...your fancy futuristic code...
# Slurm Clusters are stopped in the same way any cluster object is
stopCluster(cl)
```
With the [`doParallel`](https://cran.r-project.org/package=doParallel) package:
```r
library(doParallel)
library(slurmR)
cl <- makeSlurmCluster(50)
registerDoParallel(cl)
m <- matrix(rnorm(9), 3, 3)
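# The loop body is added for illustration: scale each row by its mean
foreach(i = 1:nrow(m), .combine = rbind) %dopar% (m[i, ] / mean(m[i, ]))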
stopCluster(cl)
```
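Because the object is meant to be used like any other socket cluster, code written against the `parallel` API can be developed locally and later pointed at Slurm. The sketch below assumes a hypothetical helper, `row_means()`, and uses a local two-worker cluster as a stand-in; on an HPC system the `makeCluster()` line would be swapped for `makeSlurmCluster()`.

```r
library(parallel)

# Hypothetical helper: it only needs a cluster object, whatever its origin
row_means <- function(cl, m) {
  # parRapply() applies the function to each row, spread over the workers
  parRapply(cl, m, mean)
}

# Local stand-in; on Slurm this could be: cl <- slurmR::makeSlurmCluster(50)
cl <- makeCluster(2)

m <- matrix(rnorm(9), 3, 3)
res <- row_means(cl, m)

stopCluster(cl)
```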
## Example 4: Using slurmR directly from the command line
The `slurmR` package has a couple of convenience functions designed to save the
user time. First, the function `sourceSlurm()` allows skipping the explicit
creation of a bash script file to be used together with `sbatch` by putting all
the required config options on the first lines of an R script, for example:
```{r, results='asis', echo=FALSE}
cat("```\n")
cat(readLines(system.file("example.R", package="slurmR")), sep="\n")
cat("```\n")
```
This is an R script whose first line coincides with that of a bash script for
Slurm: `#!/bin/bash`. The following lines start with `#SBATCH`, explicitly
specifying options for `sbatch`, and the remaining lines are just R code.
The previous R script is included in the package (type `system.file("example.R", package="slurmR")`).
Imagine that the R script is named `example.R`; you can then use the `sourceSlurm`
function to submit it to Slurm as follows:
```r
slurmR::sourceSlurm("example.R")
```
This will create the corresponding bash file required to be used with `sbatch`,
and submit it to Slurm.
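To see why such a file can serve both purposes, note that every header line starts with `#`, so R reads it as a comment. The sketch below writes a made-up script to a temporary file and separates the `#SBATCH` options from the R code using base R only. The option values are placeholders, and this is not a description of how `sourceSlurm()` parses the file internally.

```r
# Hypothetical script contents; the #SBATCH values are placeholders
script <- c(
  "#!/bin/bash",
  "#SBATCH --job-name=hello-slurmR",
  "#SBATCH --time=00:10:00",
  "#SBATCH --ntasks=1",
  "message(\"Hello from \", Sys.info()[[\"nodename\"]])"
)
path <- tempfile(fileext = ".R")
writeLines(script, path)

lines <- readLines(path)

# Header lines carry the sbatch options
is_sbatch <- grepl("^#SBATCH", lines)
sbatch_opts <- trimws(sub("^#SBATCH", "", lines[is_sbatch]))
sbatch_opts

# To R, the header is all comments: only one expression is parsed
n_expr <- length(parse(file = path))
n_expr
```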
Another nice tool is `slurmr_cmd()`. This function creates a simple bash
script that can be used as a command line tool to submit this type of R script.
Moreover, it can add the command to your session's [**alias**](https://en.wikipedia.org/wiki/Alias_(command)) as follows:
```r
library(slurmR)
slurmr_cmd("~", add_alias = TRUE)
```
Once that's done, you can simply submit R scripts with "Slurm-like headers" (as
shown previously) as follows:
```bash
$ slurmr example.R
```
## Example 5: Using the preamble
Since version 0.4-3, `slurmR` includes the option `preamble`. This provides a way
for the user to specify commands/modules that need to be executed before running
the Rscript. Here is an example using `module load`:
```{r preamble, warning=FALSE}
# Turning the verbose mode off
opts_slurmR$verbose_off()
# Setting the preamble can be done globally
opts_slurmR$set_preamble("module load gcc/6.0")
# Or on the fly
ans <- Slurm_lapply(1:10, mean, plan = "none", preamble = "module load pandoc")
# Printing out the bashfile
cat(readLines(ans$bashfile), sep = "\n")
Slurm_clean(ans) # Cleaning after you
```
## VS
There are several ways to enhance R for HPC. Depending on your goals, restrictions, and preferences, you can use any of the following from this **manually curated** list:
```{r vs-table, echo = FALSE}
dat <- read.csv("comparing-projects.csv", check.names = FALSE)
dat$Dependencies <- sprintf("[![status](https://tinyverse.netlify.com/badge/%s)](https://CRAN.R-project.org/package=%1$s)", dat$Package)
dat$Activity <- sprintf("[![Activity](https://img.shields.io/github/last-commit/%s)](https://github.com/%1$s)", dat$github)
dat$Package <- sprintf("[**%s**](https://cran.r-project.org/package=%1$s)", dat$Package)
# Packages that only work with Slurm
only_w_slurm <- dat$Package[dat$`System [blank]` == "specific"]
only_w_slurm <- paste(only_w_slurm, collapse = ", ")
dat$github <- NULL
dat$`System [blank]` <- NULL
dat$`Focus on [blank]` <- NULL
knitr::kable(dat)
```
(1) After errors, part of the job or the entire job can be resubmitted.
(2) Functionality similar to the apply family in base R, e.g. lapply, sapply, mapply or similar.
(3) Creating a cluster object using either MPI or Socket connection.
The packages `r only_w_slurm` work only on Slurm. The [**drake**](https://cran.r-project.org/package=drake) package is focused on workflows.
## Contributing
We welcome contributions to `slurmR`. Whether it is reporting a bug, starting a discussion by asking a question, or proposing/requesting a new feature, please start by creating a new issue [here](https://github.com/USCbiostats/slurmR/issues) so that we can talk about it.
Please note that this project is released with a Contributor Code of Conduct (see
the CODE_OF_CONDUCT.md file included in this project). By participating in this
project you agree to abide by its terms.
## Who uses Slurm
Here is a manually curated list of institutions using Slurm:
|Institution | Country | Link |
|------------|---------|------|
| USC High Performance Computing Center | US | [link](https://hpcc.usc.edu) |
| Princeton Research Computing | US | [link](https://researchcomputing.princeton.edu/education/online-tutorials/getting-started/introducing-slurm) |
| Harvard FAS | US | [link](https://www.rc.fas.harvard.edu/resources/quickstart-guide/)|
| Harvard HMS research computing | US | [link](https://rc.hms.harvard.edu/) |
| UC San Diego WM Keck Lab for Integrated Biology | US | [link](https://keck2.ucsd.edu/dokuwiki/doku.php/wiki:slurm) |
| Stanford Sherlock | US | [link](https://www.sherlock.stanford.edu/docs/overview/introduction/) |
| Stanford SCG Informatics Cluster | US | [link](https://login.scg.stanford.edu/tutorials/job_scripts/) |
| Berkeley Research IT | US | [link](http://research-it.berkeley.edu/services/high-performance-computing/running-your-jobs) |
| University of Utah CHPC | US | [link](https://www.chpc.utah.edu/documentation/software/slurm.php) |
| University of Michigan Biostatistics cluster| US | [link](https://sph.umich.edu/biostat/computing/cluster/slurm.html) |
| The University of Kansas Center for Research Computing | US | [link](https://crc.ku.edu/hpc/how-to) |
| University of Cambridge MRC Biostatistics Unit | UK | [link](https://www.mrc-bsu.cam.ac.uk/research-and-development/high-performance-computing-at-the-bsu/) |
| Indiana University | US | [link](https://kb.iu.edu/d/awrz) |
| Caltech HPC Center | US | [link](https://www.hpc.caltech.edu/documentation/slurm-commands) |
| Institute for Advanced Study | US | [link](https://www.sns.ias.edu/computing/slurm) |
| UT Southwestern Medical Center BioHPC | US | [link](https://portal.biohpc.swmed.edu/content/guides/slurm/) |
| Vanderbilt University ACCRE | US | [link](https://www.vanderbilt.edu/accre/documentation/slurm/) |
| University of Virginia Research Computing | US | [link](https://www.rc.virginia.edu/userinfo/rivanna/slurm/) |
| Center for Advanced Computing | CA | [link](https://cac.queensu.ca/wiki/index.php/SLURM) |
| SciNet | CA | [link](https://docs.scinet.utoronto.ca/index.php/Slurm) |
| NLHPC | CL | [link](http://usuarios.nlhpc.cl/index.php/SLURM) |
| Kultrun | CL | [link](http://www.astro.udec.cl/kultrun) |
| Matbio | CL | [link](http://www.matbio.cl/cluster/) |
| TIG MIT | US | [link](https://tig.csail.mit.edu/shared-computing/slurm/) |
| MIT Supercloud | US | [link](https://supercloud.mit.edu/submitting-jobs) |
| Oxford's ARC | UK | [link](https://help.it.ox.ac.uk/arc/job-scheduling) |
## Funding
Supported by National Cancer Institute Grant #1P01CA196596.
Computation for the work described in this paper was supported by the University of Southern California’s Center for High-Performance Computing (hpcc.usc.edu).