-
Notifications
You must be signed in to change notification settings - Fork 13
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* First mockup of `hardcode_no_ct()` * Update `hardcode_no_ct()` Update `hardcode_no_ct()` by allowing the rewriting of the `target_sdtm_variable` variable to preserve `NA` * Align `hardcode_no_ct()` code style with Ramm's expectations * Add `hardcode_*()` and `assign_*()` functions * hardcode_no_ct algorithm code changes (#45) * hardcode_no_ct algorithm code changes * harcode_ct working as expected * assign_ct and assign_no_ct works great. * address review comments * Add `oak_id_vars()` * Fix typo in `recode()` * Simplify `oak_id_vars()` docs * Update `assign_*` and `hardcode_*` implementations * Introduce memoisation of `ct_mappings()` * Update of README introductory paragraph * Update hardcode_* functions' interface * Add `contains_oak_id_vars()` function * Update `contains_oak_id_vars()` doc examples * Update `sdtm_harcode()` and dependant functions * Update `assign_*` and `hardcore_*` related functions * Automatic renv profile update. * Automatic renv profile update. * Make `ct` and `cl` parameters mandatory for `assign_ct()` * Add functions ct importing - Adds three new user facing ct-related functions: `read_ct_example()`, `ct_example()` and `read_ct()` - Provides a ct example file in inst/ct/ * Bring `hardcode*()` and `assign*()` related assertions closer to user calling functions * Add lagging behind Rd for `ct_example()` * Add `assert_ct()` * Add ct assertions * Remove R/.gitkeep As it is no longer needed. * Add unit tests for `ct_vars()` * Update dependencies * Export `ct_vars()` Export `ct_vars()` such that we can cross-reference it from other functions' documentation. * Update `assert_ct()` docs * Clarify `assign_ct()`/`assign_no_ct()` doc * Improve grammar in doc * Remove last empty line from ct example file * Add documentation to `sdtm_assign()` and ct-related unit tests Although we had discussed to keep assertions only at the user facing functions, I am getting the feeling we would miss assertions also at the internal function... because of several reasons: firstly, the internal function is more flexible having more optional parameters, which requires extra assertion logic, and also because eventually we will be checking code coverage and we will regret not having done this now. * Update hardcode-related fns * Changes to meet linter issues * Code reformatting * Code reflow * Improve `assert_cl()` docs * Update `read_ct()` docs * Automatic renv profile update. * Automatic renv profile update. * Add units tests for `recode()` * Remove `are_to_recode()` function Ended up not using this function. * Add units tests for `assert_ct()` * Add one more test for `assert_ct()` * Add a basic unit test for `ct_mappings()` * Fill in some doc details of ct-related functions * Remove leftover doc text in `assign` * Update website's reference * Styling update * Bump version and update NEWS * Fix a few lintr issues * Add examples to `ct_map()` doc * Fix typo in `problems()` doc * Fix typo * Initial mockup of `assign_datetime()` * Add `.warn` parameter to `create_iso8601()` internals * Remove lint issues * Replace `.data` usage in tidyselect expressions See tidyverse/tidyverse.org#600 for more details. * Variable renaming - `ct` to `ct_spec` (ct specification) - `cl` to `ct_cltc` (codelist code) * Finish pending renaming of variables * Rename code-list to codelist * Fix style * Fix style * Add assertions to `assign_datetime()` * Add merge example to `assign_datetime()` doc * Style changes * Style changes (.Rd) * Bump version and update news * Update `ct_map()` doc example * Make tibbles more readable in doc examples * Rename `ct_cltc` to `ct_clst` As per @rammprasad's suggestion. * Stash changes so far * Fix bug in `assign_datetime` - This bug is related to the support of input is in two different variables (date and time). - A unit test was also added * Linting * Update styling * Add renv profile for R version 4.4 * Fix a few issues in `domain_example()` docs * Comment `derive_seq()` Comment `derive_seq()` function for the moment for testing purposes with new R version. * Add example with date and time to `assign_datetime()` docs * Avoid backslash hell (մերսի) Credit goes to @edgar-manukya for the expression * Update `ct_spec_vars()` docs' examples `ct_spec_vars()` used to be an internal function but not anymore: so no need for `:::`. * Fix typo in `assign_datetime()` documentation * Fix typo in `ct_map()` docs * Reformating of docs of `derive_seq()` * First mockup of `derive_seq()` * Simplification of `derive_seq()` interface `derive_seq()` no longer has a default value for parameter `id_vars` as per the meeting of 2024-05-15, so function's logic has been simplified accordingly. * Bump dev version and update NEWS * Automatic renv profile update. * Update `derive_seq()` - Two examples of domain data sets were added: vs and apsc - An internal data set `domain_record_vars` provides the mapping between domain names and key variables (thank you @edgar-manukyan) - `rec_vars()` was added to access those record-defining variables - Likewise, `sbj_vars()` for subject-level defining rows * Automatic renv profile update. * Strip down the domain example data sets * Remove `rec_vars()` * Fix linting issues * Linting issues (still) * Fix code styling issues * Addressing Edgar's review - `derive_seq()` checks the name of `tgt_var` - Added comment to clarify the code * Add `sbj_vars()` to pkgdown reference list --------- Co-authored-by: Ram Ganapathy <[email protected]> Authored-by: ramiromagno <[email protected]>
- Loading branch information
1 parent
f9816fb
commit e1aa479
Showing
25 changed files
with
1,504 additions
and
381 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
Package: sdtm.oak | ||
Type: Package | ||
Title: SDTM Data Transformation Engine | ||
Version: 0.0.0.9003 | ||
Version: 0.0.0.9004 | ||
Authors@R: c( | ||
person("Rammprasad", "Ganapathy", role = c("aut", "cre"), | ||
email = "[email protected]"), | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
#' Derive the sequence number (`--SEQ`) variable | ||
#' | ||
#' @description | ||
#' [derive_seq()] creates a new identifier variable: the sequence number | ||
#' (`--SEQ`). | ||
#' | ||
#' This function adds a newly derived variable to `tgt_dat`, namely the sequence | ||
#' number (`--SEQ`) whose name is the one provided in `tgt_var`. An integer | ||
#' sequence is generated that uniquely identifies each record within the domain. | ||
#' | ||
#' Prior to the derivation of `tgt_var`, the data frame `tgt_dat` is sorted | ||
#' according to grouping variables indicated in `rec_vars`. | ||
#' | ||
#' @param tgt_dat The target dataset, a data frame. | ||
#' @param tgt_var The target SDTM variable: a single string indicating the name | ||
#' of the sequence number (`--SEQ`) variable, e.g. `"DSSEQ"`. Note that | ||
#' supplying a name not ending in `"SEQ"` will raise a warning. | ||
#' @param rec_vars A character vector of record-level identifier variables. | ||
#' @param sbj_vars A character vector of subject-level identifier variables. | ||
#' @param start_at The sequence numbering starts at this value (default is `1`). | ||
#' | ||
#' @returns Returns the data frame supplied in `tgt_dat` with the newly derived | ||
#' variable, i.e. the sequence number (`--SEQ`), whose name is that passed in | ||
#' `tgt_var`. This variable is of type integer. | ||
#' | ||
#' @examples | ||
#' # A VS raw data set example | ||
#' (vs <- read_domain_example("vs")) | ||
#' | ||
#' # Derivation of VSSEQ | ||
#' rec_vars <- c("STUDYID", "USUBJID", "VSTESTCD", "VSDTC", "VSTPTNUM") | ||
#' derive_seq(tgt_dat = vs, tgt_var = "VSSEQ", rec_vars = rec_vars) | ||
#' | ||
#' # An APSC raw data set example | ||
#' (apsc <- read_domain_example("apsc")) | ||
#' | ||
#' # Derivation of APSEQ | ||
#' derive_seq( | ||
#' tgt_dat = apsc, | ||
#' tgt_var = "APSEQ", | ||
#' rec_vars = c("STUDYID", "RSUBJID", "SCTESTCD"), | ||
#' sbj_vars = c("STUDYID", "RSUBJID") | ||
#' ) | ||
#' @export | ||
derive_seq <- | ||
function(tgt_dat, | ||
tgt_var, | ||
rec_vars, | ||
sbj_vars = sdtm.oak::sbj_vars(), | ||
start_at = 1L) { | ||
admiraldev::assert_character_scalar(tgt_var) | ||
if (!is_seq_name(tgt_var)) { | ||
rlang::warn("Target variable name (`tgt_var`) should end in 'SEQ'.") | ||
} | ||
|
||
admiraldev::assert_character_vector(rec_vars) | ||
admiraldev::assert_character_vector(sbj_vars) | ||
admiraldev::assert_data_frame(tgt_dat, | ||
required_vars = rlang::syms(rec_vars), | ||
optional = FALSE | ||
) | ||
|
||
admiraldev::assert_integer_scalar(start_at, subset = "non-negative") | ||
|
||
tgt_dat |> | ||
# Ensure that no prior grouping exists that alters ordering and new | ||
# grouping. | ||
dplyr::ungroup() |> | ||
dplyr::arrange(dplyr::across(.cols = dplyr::all_of(rec_vars))) |> | ||
dplyr::group_by(dplyr::across(dplyr::all_of(sbj_vars))) |> | ||
dplyr::mutate("{tgt_var}" := dplyr::row_number() + start_at - 1L) |> # nolint object_name_linter() | ||
dplyr::ungroup() | ||
} | ||
|
||
#' Is it a --SEQ variable name | ||
#' | ||
#' [is_seq_name()] returns which variable names end in `"SEQ"`. | ||
#' | ||
#' @param x A character vector. | ||
#' | ||
#' @returns A logical vector. | ||
#' | ||
#' @examples | ||
#' # A valid SEQ name. | ||
#' sdtm.oak:::is_seq_name("AESEQ") | ||
#' | ||
#' # Not valid sequence number (`--SEQ`) variable names. | ||
#' # Case matters. | ||
#' sdtm.oak:::is_seq_name("AEseq") | ||
#' | ||
#' # A valid name has to end in "SEQ". | ||
#' sdtm.oak:::is_seq_name("AESEQUENCE") | ||
#' | ||
#' @keywords internal | ||
is_seq_name <- function(x) { | ||
stringr::str_detect(x, "SEQ$") | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
#' Find the path to an example SDTM domain file | ||
#' | ||
#' @description | ||
#' [domain_example()] resolves the local path to a SDTM domain example file. The | ||
#' domain examples files were imported from | ||
#' [pharmaversesdtm](https://cran.r-project.org/package=pharmaversesdtm). See | ||
#' Details section for available datasets. | ||
#' | ||
#' @details | ||
#' Datasets were obtained from | ||
#' [pharmaversesdtm](https://cran.r-project.org/package=pharmaversesdtm) but are | ||
#' originally sourced from the [CDISC pilot | ||
#' project](https://github.com/cdisc-org/sdtm-adam-pilot-project) or have been | ||
#' constructed ad-hoc by the | ||
#' [admiral](https://cran.r-project.org/package=admiral) team. These datasets | ||
#' are bundled with `{sdtm.oak}`, thus obviating a dependence on | ||
#' `{pharmaversesdtm}`. | ||
#' | ||
#' ### Example SDTM domains | ||
#' | ||
#' \describe{ | ||
#' \item{`"ae_ophtha"`}{Ophthalmology Adverse Events Dataset.} | ||
#' \item{`"ae"`}{Adverse Events Dataset-updated.} | ||
#' \item{`"ce_vaccine"`}{Clinical Events Dataset for Vaccine Studies.} | ||
#' \item{`"cm"`}{Concomitant Medication Dataset.} | ||
#' \item{`"dm_vaccine"`}{Demographics Dataset for Vaccine Studies.} | ||
#' \item{`"dm"`}{Demography Dataset.} | ||
#' \item{`"ds"`}{Disposition Dataset-updated.} | ||
#' \item{`"eg"`}{ Electrocardiogram Dataset.} | ||
#' \item{`"ex_ophtha"`}{Ophthalmology Exposure Dataset.} | ||
#' \item{`"ex_vaccine"`}{Exposures Dataset for Vaccine Studies.} | ||
#' \item{`"ex"`}{Exposure Dataset.} | ||
#' \item{`"face_vaccine"`}{Findings About Clinical Events Dataset for Vaccine Studies.} | ||
#' \item{`"is_vaccine"`}{Immunogenicity Specimen Assessments Dataset for Vaccine Studies.} | ||
#' \item{`"lb"`}{Laboratory Measurements Dataset.} | ||
#' \item{`"mh"`}{Medical History Dataset-updated.} | ||
#' \item{`"oe_ophtha"`}{Ophthalmology Adverse Events Dataset.} | ||
#' \item{`"pc"`}{Pharmacokinetics Concentrations Dataset.} | ||
#' \item{`"pp"`}{Pharmacokinetics Parameters Dataset.} | ||
#' \item{`"qs_ophtha"`}{Ophthalmology Questionnaire Dataset.} | ||
#' \item{`"rs_onco_irecist"`}{Disease Response Dataset (iRECIST).} | ||
#' \item{`"rs_onco"`}{Disease Response Dataset.} | ||
#' } | ||
#' | ||
#' @param example A string with either the basename, file name, or relative path | ||
#' to a SDTM domain example file bundled with `{stdm.oak}`, e.g. `"cm"` | ||
#' (Concomitant Medication) or `"ae"` (Adverse Events). | ||
#' | ||
#' @returns The local path to an example file if `example` is supplied, or a | ||
#' character vector of example file names. | ||
#' | ||
#' @examples | ||
#' # If no example is provided it returns a vector of possible choices. | ||
#' domain_example() | ||
#' | ||
#' # Get the local path to the Concomitant Medication dataset file. | ||
#' domain_example("cm") | ||
#' | ||
#' # Local path to the Adverse Events dataset file. | ||
#' domain_example("ae") | ||
#' | ||
#' @source See \url{https://cran.r-project.org/package=pharmaversesdtm}. | ||
#' | ||
#' @seealso [read_domain_example()] | ||
#' @export | ||
domain_example <- function(example) { | ||
# If no example is requested, then return all available files. | ||
if (missing(example)) { | ||
domain_path <- system.file("domain", package = "sdtm.oak", mustWork = TRUE) | ||
domain_files <- list.files(domain_path, pattern = "*.rds") | ||
domains <- tools::file_path_sans_ext(basename(domain_files)) | ||
return(domains) | ||
} | ||
|
||
# Otherwise, resolve the local path to the example requested. | ||
admiraldev::assert_character_scalar(example, optional = TRUE) | ||
base_name <- tools::file_path_sans_ext(basename(example)) | ||
path <- file.path("domain", paste0(base_name, ".rds")) | ||
local_path <- system.file(path, package = "sdtm.oak") | ||
|
||
if (identical(local_path, "")) { | ||
stop( | ||
glue::glue( | ||
"'{example}' does not match any domain example files. Run `domain_example()` for options." | ||
), | ||
call. = FALSE | ||
) | ||
} else { | ||
local_path <- | ||
system.file(path, package = "sdtm.oak", mustWork = TRUE) | ||
return(local_path) | ||
} | ||
} | ||
|
||
#' Read an example SDTM domain | ||
#' | ||
#' [read_domain_example()] imports one of the bundled SDTM domain examples | ||
#' as a [tibble][tibble::tibble-package] into R. See [domain_example()] for | ||
#' possible choices. | ||
#' | ||
#' @param example The name of SDTM domain example, e.g. `"cm"` (Concomitant | ||
#' Medication) or `"ae"` (Adverse Events). Run `read_domain_example()` for | ||
#' available example files. | ||
#' | ||
#' @returns A [tibble][tibble::tibble-package] with an SDTM domain dataset, or a | ||
#' character vector of example file names. | ||
#' | ||
#' @examples | ||
#' # Leave the `example` parameter as missing for available example files. | ||
#' read_domain_example() | ||
#' | ||
#' # Read the example Concomitant Medication domain. | ||
#' read_domain_example("cm") | ||
#' | ||
#' # Read the example Adverse Events domain. | ||
#' read_domain_example("ae") | ||
#' | ||
#' @seealso [domain_example()] | ||
#' @export | ||
read_domain_example <- function(example) { | ||
if (missing(example)) { | ||
return(domain_example()) | ||
} else { | ||
admiraldev::assert_character_scalar(example) | ||
} | ||
|
||
path <- domain_example(example) | ||
readr::read_rds(file = path) | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
#' Subject-level key variables | ||
#' | ||
#' [sbj_vars()] returns the set of variable names that uniquely define | ||
#' a subject. | ||
#' | ||
#' @returns A character vector of variable names. | ||
#' | ||
#' @examples | ||
#' sbj_vars() | ||
#' | ||
#' @export | ||
sbj_vars <- function() { | ||
c("STUDYID", "USUBJID") | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Title: SDTM domain example datasets. | ||
|
||
library(pharmaversesdtm) | ||
library(readr) | ||
library(here) | ||
|
||
path <- here::here("inst/domain") | ||
|
||
vs <- tibble::tribble( | ||
~STUDYID, ~DOMAIN, ~USUBJID, ~VSSPID, ~VSTESTCD, ~VSDTC, ~VSTPTNUM, | ||
"ABC123", "VS", "ABC123-375", "/F:VTLS1-D:9795532-R:2", "DIABP", "2020-09-01T13:31", NA, | ||
"ABC123", "VS", "ABC123-375", "/F:VTLS1-D:9795532-R:2", "TEMP", "2020-09-01T13:31", NA, | ||
"ABC123", "VS", "ABC123-375", "/F:VTLS2-D:9795533-R:2", "DIABP", "2020-09-28T11:00", 2L, | ||
"ABC123", "VS", "ABC123-375", "/F:VTLS2-D:9795533-R:2", "TEMP", "2020-09-28T11:00", 2L, | ||
"ABC123", "VS", "ABC123-376", "/F:VTLS1-D:9795591-R:1", "DIABP", "2020-09-20", NA, | ||
"ABC123", "VS", "ABC123-376", "/F:VTLS1-D:9795591-R:1", "TEMP", "2020-09-20", NA | ||
) | ||
|
||
apsc <- tibble::tribble( | ||
~STUDYID, ~RSUBJID, ~SCTESTCD, ~DOMAIN, ~SREL, ~SCCAT, | ||
"ABC123", "ABC123-210", "LVSBJIND", "APSC", "FRIEND", "CAREGIVERSTUDY", | ||
"ABC123", "ABC123-210", "EDULEVEL", "APSC", "FRIEND", "CAREGIVERSTUDY", | ||
"ABC123", "ABC123-210", "TMSPPT", "APSC", "FRIEND", "CAREGIVERSTUDY", | ||
"ABC123", "ABC123-211", "CAREDUR", "APSC", "SIBLING", "CAREGIVERSTUDY", | ||
"ABC123", "ABC123-211", "LVSBJIND", "APSC", "SIBLING", "CAREGIVERSTUDY", | ||
"ABC123", "ABC123-212", "JOBCLAS", "APSC", "SPOUSE", "CAREGIVERSTUDY" | ||
) | ||
|
||
readr::write_rds(x = pharmaversesdtm::ae, file = file.path(path, "ae.rds"), compress = "xz") | ||
readr::write_rds(x = pharmaversesdtm::cm, file = file.path(path, "cm.rds"), compress = "xz") | ||
readr::write_rds(x = vs, file = file.path(path, "vs.rds"), compress = "xz") | ||
readr::write_rds(x = apsc, file = file.path(path, "apsc.rds"), compress = "xz") |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,3 +25,7 @@ AE | |
AESTDY | ||
CMSTDY | ||
DM | ||
Immunogenicity | ||
Pharmacokinetics | ||
iRECIST | ||
pharmaversesdtm |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Oops, something went wrong.