The whirl package provides functionality for executing scripts in batch while simultaneously producing a log of each individual execution. In many pharmaceutical companies, a log of script execution is a GxP requirement, and the whirl package honors this requirement by generating a log that, among other things, contains information about:
- Status (whether the script ran with any errors or warnings)
- The actual code itself
- Date and time of execution
- The environment the script was executed under (session info)
- The versions of the packages that were used
- Environment variables
All of this is wrapped into a formatted HTML document that is easy to navigate.
The main function in the whirl package is run(), which takes an input argument that defines the scripts to be executed.
The simplest way is to provide the path to a single script:
library(whirl)
run("success.R")
#> ✔ success.R: Completed succesfully
It is also possible to run several scripts simultaneously:
result <- run(c("success.R", "warning.R"), n_workers = 2)
#> ✔ success.R: Completed succesfully
#> ⚠ warning.R: Completed with warnings
Here we specify that run() can use up to two simultaneous workers to execute the scripts, meaning that they will be executed in parallel.
When using run(), the following files are created:
- A log in the same directory as each script, named {script_name}_log.html. See example_log.html for an example of a simple log.
- A summary log with the overall status of each script. The default path is summary.html. See summary.html for an example of a summary of the same log as above.
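The naming pattern for the per-script log can be sketched in a few lines of R. This is a minimal illustration, not part of whirl: the helper expected_log_path() and the analysis directory are hypothetical, and it assumes that {script_name} means the file name without its extension.

```r
# Hypothetical helper (not part of whirl): guess where the log for a script
# would be written, following the {script_name}_log.html pattern.
# Assumption: {script_name} is the script's file name without its extension.
expected_log_path <- function(script) {
  script_name <- tools::file_path_sans_ext(basename(script))
  # The log is placed in the same directory as the script itself
  file.path(dirname(script), paste0(script_name, "_log.html"))
}

expected_log_path("analysis/success.R")
#> [1] "analysis/success_log.html"
```

A helper like this could be used to check with file.exists() that a log was written after a run, provided the assumption about the file name holds.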
In addition, the function returns a tibble with the status of each script execution, similar to the content of the summary above:
print(result)
#> # A tibble: 2 × 5
#>      id tag   script                                         status result
#>   <dbl> <chr> <chr>                                          <chr>  <list>
#> 1     1 <NA>  /private/var/folders/fx/71by3f551qzb5wkxt82cv… succe… <named list>
#> 2     2 <NA>  /private/var/folders/fx/71by3f551qzb5wkxt82cv… warni… <named list>
run() also supports running scripts in several sequential steps. This setup is very useful when your project has several steps that depend on each other's output and therefore need to be executed in a specific order. The best way to implement this in your project is to use a configuration file for whirl. The configuration file is a YAML file that specifies each step:
_whirl.yaml:
steps:
  - name: "First step"
    paths:
      - "success.R"
  - name: "Second step"
    paths:
      - "warning.R"
      - "error.R"
Here we specify that the first step runs success.R. Once this step has completed, we continue with the scripts in the second step.
result <- run("_whirl.yaml", n_workers = 2)
#> ✔ success.R: Completed succesfully
#> ⚠ warning.R: Completed with warnings
#> ✖ error.R: Completed with errors
print(result)
#> # A tibble: 3 × 5
#>      id tag   script                                         status result
#>   <dbl> <chr> <chr>                                          <chr>  <list>
#> 1     1 <NA>  /private/var/folders/fx/71by3f551qzb5wkxt82cv… succe… <named list>
#> 2     2 <NA>  /private/var/folders/fx/71by3f551qzb5wkxt82cv… warni… <named list>
#> 3     3 <NA>  /private/var/folders/fx/71by3f551qzb5wkxt82cv… error  <named list>
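Because the returned tibble has one row per script, it can be inspected programmatically, for example to list the scripts that ended in an error. The sketch below uses a stand-in data frame rather than a real call to run(), so that it runs on its own. Only the status value "error" is taken from the printed output above; the other status strings are truncated there, so placeholders are used for them.

```r
# Stand-in for the tibble returned by run(), with only the columns used here.
# The non-error status strings are truncated in the printed output above,
# so placeholders are used instead of guessing them.
result <- data.frame(
  id = 1:3,
  script = c("success.R", "warning.R", "error.R"),
  status = c("<status 1>", "<status 2>", "error"),
  stringsAsFactors = FALSE
)

# Scripts whose status is "error"
failed <- result$script[result$status == "error"]

if (length(failed) > 0) {
  message("Scripts with errors: ", paste(failed, collapse = ", "))
}
```

In a real project the same subsetting would be applied to the object returned by run(). Note that the script column there contains full paths, as the printed tibble shows.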
For more information about how to customize the execution and the logging for your needs, see the following:
- run(): For further information on how to call it.
- vignette("whirl"): For a more in-depth explanation and more advanced usage.
- vignette("articles/example"): A simple example, including the created log.
- options(): On how to change the default behavior of whirl.