Add export_manifest function #3

Merged
merged 172 commits into from
Apr 25, 2024
172 commits
7310f25
Add export_manifest function
AlexAxthelm Mar 12, 2024
ae6e425
Add file metadata functions
AlexAxthelm Mar 12, 2024
71ef70a
Update docs
AlexAxthelm Mar 12, 2024
bc1d483
bump dev version
AlexAxthelm Mar 12, 2024
f216efa
add `digest` dependency
AlexAxthelm Mar 12, 2024
ef1f149
add `.github` directory to Rbuildignore
AlexAxthelm Mar 12, 2024
e95981f
Define log formatter for package
AlexAxthelm Mar 14, 2024
9af1650
Add .lintr config
AlexAxthelm Mar 14, 2024
c37466b
feat(app): #1 improve handling of metadata summary for unknown filetypes
AlexAxthelm Mar 14, 2024
3fe4b56
test(app): #1 Add tests for single file metadata
AlexAxthelm Mar 14, 2024
08ed2b0
test(app): #1 Suppress logging output during tests
AlexAxthelm Mar 14, 2024
cc7efc5
Add summary metadata support for JSON
AlexAxthelm Mar 14, 2024
0f5ec4f
Add handling for missing files
AlexAxthelm Mar 14, 2024
ed4c6d0
Move logger setup into test file
AlexAxthelm Mar 14, 2024
f038252
import logger functions
AlexAxthelm Mar 14, 2024
7c2fd54
Fix test for logging missing file error
AlexAxthelm Mar 14, 2024
cd71fe4
shorten line
AlexAxthelm Mar 14, 2024
86607a9
Remove testing for log output
AlexAxthelm Mar 14, 2024
e8f25ff
use actual digests, rather than hardcoded values
AlexAxthelm Mar 14, 2024
bd312ef
Update Rbuildignore
AlexAxthelm Mar 14, 2024
7c59bd0
add more tests for error cases
AlexAxthelm Mar 14, 2024
14f5514
Rename variables for consistency
AlexAxthelm Mar 14, 2024
ada5480
Check for actual filesize, rather than hardcoding
AlexAxthelm Mar 14, 2024
715ee07
remove trailing line
AlexAxthelm Mar 14, 2024
1163c93
Add tests for get_file_metadata()
AlexAxthelm Mar 14, 2024
d26f58b
Add more tests
AlexAxthelm Mar 14, 2024
44aded2
Add test for empty file
AlexAxthelm Mar 14, 2024
bb34626
update filesize check in tests
AlexAxthelm Mar 14, 2024
6007d9a
add check for NULL input
AlexAxthelm Mar 14, 2024
ba9b914
Update docs
AlexAxthelm Mar 14, 2024
d3c9ced
Add manifest creation datetime to output
AlexAxthelm Mar 18, 2024
f50be58
add function to get environment information
AlexAxthelm Mar 19, 2024
c43ca87
Add test for session info
AlexAxthelm Mar 19, 2024
bc3e60e
eliminate extra comma
AlexAxthelm Mar 19, 2024
d7297c3
Reindent
AlexAxthelm Mar 19, 2024
7a25821
Add tests for local package info
AlexAxthelm Mar 19, 2024
ac5a6f3
Add test for package info from GH source
AlexAxthelm Mar 19, 2024
b58e156
Add dependency on `pak`
AlexAxthelm Mar 19, 2024
85e46ec
linting
AlexAxthelm Mar 19, 2024
8c4d51c
Change testing flow for local packages
AlexAxthelm Mar 20, 2024
b6d28fe
add local R_USER_CACHE_DIR
AlexAxthelm Mar 20, 2024
c1a40a7
Change from pak to pkgdepends fo package info
AlexAxthelm Mar 20, 2024
506af48
Capture PAK output to make tests easier to read
AlexAxthelm Mar 20, 2024
2df69db
Sanitize package details before exporting
AlexAxthelm Mar 20, 2024
6e86e29
Add pak to testing deps
AlexAxthelm Mar 20, 2024
6281f19
copy pak functions before messing with libpaths
AlexAxthelm Mar 20, 2024
bf63e19
Fix madness with libpaths
AlexAxthelm Mar 20, 2024
0239085
bump dev version
AlexAxthelm Mar 20, 2024
0317d14
Add debugging informtaion to test ouput
AlexAxthelm Mar 20, 2024
0cad051
prefix libpath, not replace
AlexAxthelm Mar 20, 2024
738aeb6
Update Docs
AlexAxthelm Mar 20, 2024
05753b8
refactor local installs
AlexAxthelm Mar 20, 2024
24bbd96
refactor package info expectations
AlexAxthelm Mar 20, 2024
9788431
Allow matching for repo
AlexAxthelm Mar 20, 2024
1a5233b
Change from `git2r` to `gert`
AlexAxthelm Mar 20, 2024
4096ad8
Chagne dependency in DESCRIPTION
AlexAxthelm Mar 20, 2024
408825e
Change argument name
AlexAxthelm Mar 20, 2024
21b1b4f
explicitly define pak sources
AlexAxthelm Mar 20, 2024
e932448
better define matching regex
AlexAxthelm Mar 20, 2024
cd0fcb0
add libPaths to session info
AlexAxthelm Mar 20, 2024
09d54c8
Convert filepath sep in expectations
AlexAxthelm Mar 20, 2024
bfb3f7a
Add updated libpaths to tests
AlexAxthelm Mar 20, 2024
5079cc8
Add fix for remotepkgref on Windows
AlexAxthelm Mar 20, 2024
ef590c6
nolint
AlexAxthelm Mar 20, 2024
8825b76
Add test for missing package
AlexAxthelm Mar 20, 2024
41f40b5
Switch pattern and replacement in file path fix
AlexAxthelm Mar 20, 2024
930a8ad
add lintr exclusion
AlexAxthelm Mar 21, 2024
2249851
Add check for singularity
AlexAxthelm Mar 21, 2024
80f1c23
Add test for singularity
AlexAxthelm Mar 21, 2024
17d3782
Add check for package existence
AlexAxthelm Mar 21, 2024
df2f296
clean up unneeded nolints
AlexAxthelm Mar 21, 2024
286e770
add check for multiple installations of package
AlexAxthelm Mar 21, 2024
941044a
add test for libpaths down the search tree
AlexAxthelm Mar 21, 2024
5cab781
Fix behavior for down-search-path libraries
AlexAxthelm Mar 21, 2024
eafbc85
namespace `utils::installed.packages()`
AlexAxthelm Mar 21, 2024
4a8f591
fix IMCOMPLETE_STRING lintr error
AlexAxthelm Mar 21, 2024
daf3526
Linting
AlexAxthelm Mar 21, 2024
9aee81e
Fix failing test
AlexAxthelm Mar 21, 2024
631b926
Check version as part of multiple installs
AlexAxthelm Mar 21, 2024
21d7660
Attempt fixing windows paths
AlexAxthelm Mar 21, 2024
8367fb0
reduce strictness for platform check
AlexAxthelm Mar 21, 2024
f48d550
reflow tests when altering libpaths against current
AlexAxthelm Mar 21, 2024
aee5cc1
Reindent
AlexAxthelm Mar 21, 2024
72da85f
Fix paths on windows (again)
AlexAxthelm Mar 21, 2024
5b5cb77
Change placeholder return value
AlexAxthelm Mar 21, 2024
23a1c34
Add placeholder test
AlexAxthelm Mar 21, 2024
447532b
move functions into own file
AlexAxthelm Mar 21, 2024
fe0f9d2
Document session info
AlexAxthelm Mar 21, 2024
39d9716
update docstrings
AlexAxthelm Mar 21, 2024
cd67cc1
Update rendered docs
AlexAxthelm Mar 21, 2024
01f82b9
Rename test file
AlexAxthelm Mar 21, 2024
567a88d
Name testthat args
AlexAxthelm Mar 21, 2024
2bd9954
Allow NA fgor pkg ref (base pkgs)
AlexAxthelm Mar 21, 2024
ca3a8eb
Reflow docstrings
AlexAxthelm Mar 21, 2024
73f1773
improve tests
AlexAxthelm Mar 21, 2024
d9f2430
Add test for empty arguments
AlexAxthelm Mar 21, 2024
41bbce0
Add tests for `get_package_info()`
AlexAxthelm Mar 21, 2024
181edbd
Update docstrings
AlexAxthelm Mar 21, 2024
52470c2
update rendered docs
AlexAxthelm Mar 21, 2024
d0228c2
Reflow docstrings
AlexAxthelm Mar 21, 2024
092c957
Update rendered docs
AlexAxthelm Mar 21, 2024
26afb9c
Add tests for NULL arguments
AlexAxthelm Mar 22, 2024
9a7847e
change structure of outputs
AlexAxthelm Mar 22, 2024
c1d1f72
Update signature for get_package_info
AlexAxthelm Mar 25, 2024
ebc6517
update rendered docs
AlexAxthelm Mar 25, 2024
f2c816f
re-enable tests
AlexAxthelm Mar 25, 2024
86b0db7
Add test for mixed nesting
AlexAxthelm Mar 25, 2024
cc7f45d
lintr exclusion
AlexAxthelm Mar 25, 2024
c558112
Add handling for packages loaded with pkgload
AlexAxthelm Mar 25, 2024
ed66d62
Add test for `devtools::load_all()`
AlexAxthelm Mar 25, 2024
9addb26
Add pkgload dependency
AlexAxthelm Mar 25, 2024
eb69f3a
Add skips for missing packages
AlexAxthelm Mar 25, 2024
f6a54f2
Add warning handling
AlexAxthelm Mar 25, 2024
cb5a36f
Reindent
AlexAxthelm Mar 25, 2024
f54db91
Add devtools to suggests
AlexAxthelm Mar 25, 2024
f54e709
Check for pkgload, not CRAN for warning handler
AlexAxthelm Mar 25, 2024
a9b856a
Do not test for silence
AlexAxthelm Mar 25, 2024
69ac9bb
update rendered docs
AlexAxthelm Mar 25, 2024
876b4fb
Normalize path
AlexAxthelm Mar 25, 2024
d94cc6a
change output key name
AlexAxthelm Mar 26, 2024
eecbd85
Merge pull request #6 from RMI-PACTA/get-environment
AlexAxthelm Apr 2, 2024
5bc0d47
Add function to determine if path is in a git repo
AlexAxthelm Apr 2, 2024
720c86d
Add check for active branch and upstream
AlexAxthelm Apr 3, 2024
c73426a
Tests for git path
AlexAxthelm Apr 3, 2024
654d892
Add check for dirty/clean
AlexAxthelm Apr 3, 2024
f392039
Add test for git conflicts
AlexAxthelm Apr 3, 2024
036d4ba
linting
AlexAxthelm Apr 3, 2024
22fa3f3
suppressMessages in testing
AlexAxthelm Apr 3, 2024
42cdc63
Increment version number to 0.0.0.9003
AlexAxthelm Apr 3, 2024
5647cca
Change dependency from suggest to import
AlexAxthelm Apr 3, 2024
678621f
Add handling for git tags
AlexAxthelm Apr 3, 2024
ed89c34
Add git config step in testing
AlexAxthelm Apr 3, 2024
018dbed
fix output path on windows
AlexAxthelm Apr 3, 2024
50e3583
Add testing on cloned repos
AlexAxthelm Apr 3, 2024
c90b42d
Further fixes for windows paths
AlexAxthelm Apr 3, 2024
fb377fe
Linting
AlexAxthelm Apr 3, 2024
2188d28
More fixes for windows paths
AlexAxthelm Apr 3, 2024
d27d0d0
Add git info for relevant packages
AlexAxthelm Apr 3, 2024
ab1cb95
Add git info for local packages installed with pak
AlexAxthelm Apr 3, 2024
4260aaa
update get_package_info tests
AlexAxthelm Apr 3, 2024
ef573f8
Fix for paths on windows
AlexAxthelm Apr 3, 2024
42406cd
linting
AlexAxthelm Apr 3, 2024
c951bc9
Use testing git config as helper function
AlexAxthelm Apr 3, 2024
9ca08df
No not compile `pkgload`ed packages
AlexAxthelm Apr 8, 2024
3a361c0
Set envvar for pkgload
AlexAxthelm Apr 8, 2024
a08a58f
INclude suppression of `symbols.rds` for all tests
AlexAxthelm Apr 8, 2024
50c8ee3
Change testing package
AlexAxthelm Apr 8, 2024
0a2889a
Linting
AlexAxthelm Apr 8, 2024
10a062a
Merge pull request #9 from RMI-PACTA/git-utilities
AlexAxthelm Apr 8, 2024
bf989b7
Add params argument
AlexAxthelm Apr 9, 2024
8579481
Add environment info to manifest
AlexAxthelm Apr 9, 2024
e720d60
Update ref for remote repo
AlexAxthelm Apr 9, 2024
702a752
Update tests
AlexAxthelm Apr 9, 2024
3060df2
remove overly specific test
AlexAxthelm Apr 10, 2024
ed3f22c
Add handling for ... arguments for manifest
AlexAxthelm Apr 10, 2024
177ac17
Add tests for export
AlexAxthelm Apr 10, 2024
eb06e42
Export NULL rather than NA for missing details
AlexAxthelm Apr 10, 2024
218a271
Update rendered docs
AlexAxthelm Apr 10, 2024
4fcfdc8
fix flaky tests
AlexAxthelm Apr 10, 2024
6f60990
fix: deal with tzone attribute in test
AlexAxthelm Apr 19, 2024
c3d0d2a
linting
AlexAxthelm Apr 19, 2024
cd31ad4
fix: explicitly set tzone attr
AlexAxthelm Apr 19, 2024
7cb0f9a
use non-CRAN package for testing lower libpaths
AlexAxthelm Apr 19, 2024
443446c
add packagename to warning string
AlexAxthelm Apr 19, 2024
9e725d0
linting
AlexAxthelm Apr 19, 2024
66617a9
exclude failing test from `covr`
AlexAxthelm Apr 23, 2024
4853a9b
Revert "add packagename to warning string"
AlexAxthelm Apr 23, 2024
bcbfb6f
Add `covr` to Suggests
AlexAxthelm Apr 23, 2024
968358c
Update warning message
AlexAxthelm Apr 24, 2024
0677b2e
Update docstrings
AlexAxthelm Apr 24, 2024
1806739
add human-readable file size
AlexAxthelm Apr 24, 2024
37675e9
linting
AlexAxthelm Apr 24, 2024
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -1 +1,3 @@
.github/
^.lintr$
^LICENSE\.md$
4 changes: 4 additions & 0 deletions .lintr
@@ -0,0 +1,4 @@
linters: all_linters()
exclusions: list(
    "tests/testthat.R"
  )
18 changes: 16 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: pacta.workflow.utils
Title: Utility functions for PACTA workflows
Version: 0.0.0.9000
Version: 0.0.0.9003
Authors@R:
    c(person(given = "Alex",
             family = "Axthelm",
@@ -14,4 +14,18 @@ Description: Provide utility functions to be called across RMI-PACTA's workflows
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.0
RoxygenNote: 7.3.1
Imports:
    digest,
    gert,
    jsonlite,
    logger,
    pkgdepends,
    pkgload
Suggests:
    covr,
    devtools,
    pak,
    testthat (>= 3.0.0),
    withr
Config/testthat/edition: 3
6 changes: 6 additions & 0 deletions NAMESPACE
@@ -1,2 +1,8 @@
# Generated by roxygen2: do not edit by hand

export(export_manifest)
importFrom(logger,log_debug)
importFrom(logger,log_error)
importFrom(logger,log_info)
importFrom(logger,log_trace)
importFrom(logger,log_warn)
104 changes: 104 additions & 0 deletions R/export_manifest.R
@@ -0,0 +1,104 @@
#' Export manifest file with metadata
#'
#' @param manifest_path Path to the manifest file.
#' @param input_files List or vector (named or unnamed) of files that are
#'   inputs to the workflow. Passed to [get_file_metadata()].
#' @param output_files List or vector (named or unnamed) of files that are
#'   outputs from the workflow. Passed to [get_file_metadata()].
#' @param params List of parameters used to define the workflow.
#' @param ... Nested list to be included in manifest. Passed on to
#'   `create_manifest()`.
#'
#' @return (invisible) JSON string with metadata manifest.
#'
#' @export
export_manifest <- function(
  manifest_path,
  input_files,
  output_files,
  params,
  ...
) {

  # `params` is not a formal argument of create_manifest(); it travels through
  # `...` and is validated there like any other extra argument.
  manifest_list <- create_manifest(
    input_files = input_files,
    output_files = output_files,
    params = params,
    ...
  )

  manifest_json <- jsonlite::toJSON(
    manifest_list,
    pretty = TRUE,
    auto_unbox = TRUE,
    null = "null",
    na = "string"
  )

  logger::log_debug("Writing metadata to file: ", manifest_path)
  writeLines(
    text = manifest_json,
    con = manifest_path
  )
  return(invisible(manifest_json))
}
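
The `jsonlite::toJSON()` options chosen in `export_manifest()` control how scalars, `NULL`, and `NA` appear in the written manifest. The following is a minimal sketch, assuming `jsonlite` is installed; `manifest_like` is a made-up list for illustration and not the structure that `create_manifest()` produces.

```r
# Illustrative only: `manifest_like` is a hypothetical stand-in for a manifest.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  manifest_like <- list(
    name = "example",             # length-one vector
    missing_detail = NULL,        # detail that is not available
    score = NA,                   # missing value
    files = c("a.csv", "b.csv")   # longer vectors stay arrays
  )

  json <- jsonlite::toJSON(
    manifest_like,
    auto_unbox = TRUE, # write length-one vectors as scalars, not arrays
    null = "null",     # write NULL as JSON null instead of {}
    na = "string"      # write NA as a string instead of null
  )
  print(json)

  # Reading the string back shows that the scalar survived unboxing.
  round_trip <- jsonlite::fromJSON(json)
  print(round_trip[["name"]])
}
```

Without `auto_unbox = TRUE`, every length-one value would be wrapped in a JSON array, which tends to make manifests harder to read and to consume from other languages.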

create_manifest <- function(
  input_files,
  output_files,
  ...
) {
  log_debug("Creating metadata manifest")
  log_trace("Checking ... arguments")
  args_list <- list(...)
  if (length(args_list) > 0L) {
    clean_args <- check_arg_type(args_list)
  }
  manifest_list <- list(
    input_files = get_file_metadata(input_files),
    output_files = get_file_metadata(output_files),
    environment = get_manifest_envirionment_info(),
    manifest_creation_datetime = format.POSIXct(
      x = Sys.time(),
      format = "%F %R",
      tz = "UTC",
      usetz = TRUE
    )
  )
  # inherits = FALSE: only a `clean_args` created in this call counts, not a
  # same-named object in an enclosing environment.
  if (exists("clean_args", inherits = FALSE)) {
    manifest_list <- c(manifest_list, clean_args)
  }
  return(manifest_list)
}
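
Two details of `create_manifest()` are easy to check in isolation: the timestamp format string, and how `exists()` looks up a variable. The sketch below uses only base R; the fixed timestamp and the helper names `check_default()` and `check_local()` are hypothetical and exist only for this illustration.

```r
# 1. Timestamp formatting: "%F %R" is shorthand for "%Y-%m-%d %H:%M".
fixed_time <- as.POSIXct("2024-04-25 12:34:56", tz = "UTC")
stamp <- format(fixed_time, format = "%F %R", tz = "UTC", usetz = TRUE)
print(stamp)
# "2024-04-25 12:34 UTC"

# 2. exists() also searches enclosing environments unless inherits = FALSE.
clean_args <- "defined outside the function"

check_default <- function() exists("clean_args")
check_local <- function() exists("clean_args", inherits = FALSE)

print(check_default()) # TRUE: finds the object defined outside
print(check_local())   # FALSE: nothing with that name inside the function
```

The second part matters when a function relies on `exists()` to detect whether a local variable was assigned: an object with the same name in the calling session can make the default lookup succeed unexpectedly.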

# Check that arguments are nicely coercible to JSON. Called for the side effect
# of `stop()` if they are not.
check_arg_type <- function(arg) {
  log_trace("Checking argument type")
  if (inherits(arg, "list")) {
    if (
      length(arg) != length(names(arg)) ||
        any(names(arg) == "")
    ) {
      log_error("elements of lists in ... must be named")
      stop("unnamed arguments in ... of create_manifest (or in nested list)")
    }
    lapply(
      X = arg,
      FUN = check_arg_type
    )
  } else {
    if (
      inherits(arg, "character") ||
        inherits(arg, "numeric") ||
        inherits(arg, "integer") ||
        inherits(arg, "logical")
    ) {
      log_trace("arg is a simple type")
    } else {
      log_error("arg is not a simple type")
      stop("Arguments in ... must be simple types")
    }
  }
  return(arg)
}
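
The recursive validation in `check_arg_type()` can be tried out without the package. The sketch below is a simplified stand-in, not the package function: `check_simple_args()` is a hypothetical name, logging is dropped, and the error messages are shortened.

```r
# Hypothetical, logging-free stand-in for check_arg_type().
check_simple_args <- function(arg) {
  if (inherits(arg, "list")) {
    # Every element of a list must carry a non-empty name.
    if (length(arg) != length(names(arg)) || any(names(arg) == "")) {
      stop("unnamed element in list")
    }
    # Recurse into each element; the result is discarded, errors propagate.
    lapply(arg, check_simple_args)
  } else if (!inherits(arg, c("character", "numeric", "integer", "logical"))) {
    stop("not a simple type")
  }
  invisible(arg)
}

# A fully named nested list passes and is returned unchanged.
ok <- check_simple_args(list(a = 1, b = list(c = "x", d = TRUE)))

# An unnamed element anywhere in the nesting is rejected.
unnamed_error <- tryCatch(
  check_simple_args(list(a = 1, list(2))),
  error = function(e) conditionMessage(e)
)

# Objects with other classes, such as Date, are rejected as well.
type_error <- tryCatch(
  check_simple_args(list(when = Sys.Date())),
  error = function(e) conditionMessage(e)
)

print(unnamed_error)
print(type_error)
```

Requiring names at every level means each extra value maps to a JSON object key, so the manifest never contains anonymous array entries whose meaning would have to be guessed.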
33 changes: 33 additions & 0 deletions R/get_environment.R
@@ -0,0 +1,33 @@
#' Get Environment information for manifest
#'
#' This function takes no arguments and returns a nested list, suitable for
#' inclusion in manifest export.
#'
#' @return nested list with two elements: `session` (R session details) and
#'   `packages` (package information).
get_manifest_envirionment_info <- function() {
  environment_list <- list(
    session = get_r_session_info(),
    packages = get_package_info()
  )
  return(environment_list)
}

#' Get session information for manifest
#'
#' This function takes no arguments and returns a list, suitable for
#' inclusion in manifest export.
#'
#' @return list of session details, including R Version, platform, OS
#'   (`running`), locale, timezone, and library paths.
get_r_session_info <- function() {
  # Query the session once and reuse the result for every field.
  session_info <- utils::sessionInfo()
  return(
    list(
      R.version = session_info[["R.version"]],
      platform = session_info[["platform"]],
      running = session_info[["running"]],
      locale = session_info[["locale"]],
      tzone = session_info[["tzone"]],
      libPaths = .libPaths() # nolint: undesirable_function_linter
    )
  )
}
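
To see what kind of values `get_r_session_info()` collects, the fields can be inspected directly in any R session. This is a small sketch using base R only; the `session` list and its element names are chosen for the example and do not match the package output exactly.

```r
# Inspect a few of the fields that utils::sessionInfo() exposes.
si <- utils::sessionInfo()

session <- list(
  # R.version is itself a list; version.string is its printable summary.
  r_version = si[["R.version"]][["version.string"]],
  platform = si[["platform"]],
  running = si[["running"]],
  lib_paths = .libPaths()
)

str(session)
```

Because `R.version` is a nested list, the exported manifest carries each of its components (major, minor, platform, and so on) rather than a single version string.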
104 changes: 104 additions & 0 deletions R/get_file_metadata.R
@@ -0,0 +1,104 @@
#' Get Metadata for a vector of filepaths
#'
#' This function takes a vector of filepaths and returns a list of file
#' details, suitable for inclusion in manifest export.
#'
#' @param filepaths vector of filepaths
#'
#' @return nested list of file details, length the same as the input vector.
get_file_metadata <- function(filepaths) {
  logger::log_trace("Getting metadata for files.")

  file_metadata <- lapply(filepaths, get_single_file_metadata)

  return(file_metadata)
}
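
Since `get_file_metadata()` is a thin wrapper around `lapply()`, names on the input carry over to the result. The sketch below illustrates that with a hypothetical `describe_path()` helper standing in for `get_single_file_metadata()`; the file names are invented and no files are read.

```r
# Hypothetical stand-in that only looks at the path string.
describe_path <- function(path) {
  list(file_name = basename(path), file_extension = tools::file_ext(path))
}

# Named input: the names become the names of the result.
named_result <- lapply(
  c(portfolio = "data/portfolio.csv", benchmark = "data/benchmark.rds"),
  describe_path
)
print(names(named_result))

# Unnamed input: the result has no names.
unnamed_result <- lapply(c("data/portfolio.csv"), describe_path)
print(names(unnamed_result))
```

When serialized with `jsonlite::toJSON()`, a named list becomes a JSON object keyed by those names, while an unnamed list becomes a JSON array, so naming the inputs is a cheap way to make the manifest self-describing.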

#' Get Metadata for a file
#'
#' This function takes a single filepath and returns a list of file
#' details, suitable for inclusion in manifest export.
#'
#' @param filepath path to a single file
#'
#' @return list of file details
get_single_file_metadata <- function(filepath) {
  if (length(filepath) > 1L) {
    logger::log_error("get_single_file_metadata only accepts single files.")
    stop("Only one file path can be passed to get_single_file_metadata.")
  }
  if (!file.exists(filepath)) {
    logger::log_error("File does not exist: \"{filepath}\".")
    stop("File does not exist.")
  }

  logger::log_trace("Getting metadata for file: \"{filepath}\".")

  file_name <- basename(filepath)
  file_extension <- tools::file_ext(filepath)
  file_path <- filepath
  file_size <- file.info(filepath)[["size"]]
  class(file_size) <- "object_size"
  file_last_modified <- format(
    as.POSIXlt(file.info(filepath)[["mtime"]], tz = "UTC"),
    "%Y-%m-%dT%H:%M:%S+00:00"
  )
  file_md5 <- digest::digest(filepath, algo = "md5", file = TRUE)

  file_metadata <- list(
    file_name = file_name,
    file_extension = file_extension,
    file_path = file_path,
    file_size_human = format(
      file_size,
      units = "auto",
      standard = "SI"
    ),
    file_size = as.integer(file_size),
    file_last_modified = file_last_modified,
    file_md5 = file_md5
  )

  logger::log_trace("Getting summary information for file: \"{filepath}\".")
  if (tolower(tools::file_ext(filepath)) == "rds") {
    contents <- readRDS(filepath)
  } else if (tolower(tools::file_ext(filepath)) == "csv") {
    contents <- utils::read.csv(filepath)
  } else if (tolower(tools::file_ext(filepath)) == "json") {
    contents <- jsonlite::fromJSON(filepath)
  } else {
    logger::log_trace(
      "File not supported for summary information: \"{filepath}\"."
    )
    contents <- NULL
  }
  # expecting a data.frame for output files
  if (inherits(contents, "data.frame")) {
    summary_info <- list(
      nrow = nrow(contents),
      colnames = colnames(contents),
      class = class(contents)
    )
  } else if (inherits(contents, "list")) {
    summary_info <- list(
      length = length(contents),
      names = names(contents),
      class = class(contents)
    )
  } else if (!is.null(contents)) {
    logger::log_trace(
      "Only data.frame and list objects supported for summary information."
    )
    summary_info <- list(
      class = class(contents)
    )
  } else {
    summary_info <- NULL
  }

  if (exists("summary_info")) {
    file_metadata[["summary_info"]] <- summary_info
  }

  return(file_metadata)
}
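
The individual metadata fields gathered by `get_single_file_metadata()` can be reproduced on a throwaway file. The following is a rough sketch using base R only; it swaps in `tools::md5sum()` for `digest::digest()` so that no extra package is needed (both hash the file contents), and the temporary CSV is created purely for the example.

```r
# Write a small CSV to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  path,
  row.names = FALSE
)

info <- file.info(path)

# Size in bytes, plus a human-readable form via the "object_size" class.
size_bytes <- info[["size"]]
size_human <- format(
  structure(size_bytes, class = "object_size"),
  units = "auto",
  standard = "SI"
)

# Modification time as an ISO 8601 style string in UTC.
modified <- format(
  as.POSIXlt(info[["mtime"]], tz = "UTC"),
  "%Y-%m-%dT%H:%M:%S+00:00"
)

# MD5 of the file contents (base R stand-in for digest::digest(file = TRUE)).
md5 <- unname(tools::md5sum(path))

# Summary of the parsed contents, as is done for data.frame inputs.
contents <- utils::read.csv(path)
summary_info <- list(
  nrow = nrow(contents),
  colnames = colnames(contents),
  class = class(contents)
)

str(list(size_human = size_human, modified = modified, md5 = md5))
str(summary_info)
```

Recording both a hash and a content summary serves different purposes: the hash detects any byte-level change, while the row count and column names give a quick human check that the file has the expected shape.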