diff --git a/v0.9.1/404.html b/v0.9.1/404.html new file mode 100644 index 0000000000..3277e85182 --- /dev/null +++ b/v0.9.1/404.html @@ -0,0 +1,87 @@ + + +
+ + + + +.github/CODE_OF_CONDUCT.md
+ We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
+We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
+Examples of behavior that contributes to a positive environment for our community include:
+Examples of unacceptable behavior include:
+Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
+Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
+This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at [INSERT CONTACT METHOD]. All complaints will be reviewed and investigated promptly and fairly.
+All community leaders are obligated to respect the privacy and security of the reporter of any incident.
+Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
+Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
+Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
+Community Impact: A violation through a single incident or series of actions.
+Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
+Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
+Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
+Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
+Consequence: A permanent ban from any sort of public interaction within the community.
+This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.
+Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.
+For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
+.github/CONTRIBUTING.md
+ 🙏 Thank you for taking the time to contribute!
+Your input is deeply valued, whether an issue, a pull request, or even feedback, regardless of size, content or scope.
+ +Please refer the project documentation for a brief introduction. Please also see other articles within the project documentation for additional information.
+A Code of Conduct governs this project. Participants and contributors are expected to follow the rules outlined therein.
+All your contributions will be covered by this project’s license.
+We use GitHub to track issues, feature requests, and bugs. Before submitting a new issue, please check if the issue has already been reported. If the issue already exists, please upvote the existing issue 👍.
+For new feature requests, please elaborate on the context and the benefit the feature will have for users, developers, or other relevant personas.
+This repository uses the GitHub Flow model for collaboration. To submit a pull request:
+Create a branch
+Please see the branch naming convention below. If you don’t have write access to this repository, please fork it.
+Make changes
+Make sure your code
+Create a pull request (PR)
+In the pull request description, please link the relevant issue (if any), provide a detailed description of the change, and include any assumptions.
+Address review comments, if any
Post approval
+Merge your PR if you have write access. Otherwise, the reviewer will merge the PR on your behalf.
+Pat yourself on the back
+Congratulations! 🎉 You are now an official contributor to this project! We are grateful for your contribution.
+Suppose your changes are related to a current issue in the current project; please name your branch as follows: <issue_id>_<short_description>
. Please use underscore (_
) as a delimiter for word separation. For example, 420_fix_ui_bug
would be a suitable branch name if your change is resolving and UI-related bug reported in issue number 420
in the current project.
If your change affects multiple repositories, please name your branches as follows: <issue_id>_<issue_repo>_<short description>
. For example, 69_awesomeproject_fix_spelling_error
would reference issue 69
reported in project awesomeproject
and aims to resolve one or more spelling errors in multiple (likely related) repositories.
monorepo
and staged.dependencies
+Sometimes you might need to change upstream dependent package(s) to be able to submit a meaningful change. We are using staged.dependencies
functionality to simulate a monorepo
behavior. The dependency configuration is already specified in this project’s staged_dependencies.yaml
file. You need to name the feature branches appropriately. This is the only exception from the branch naming convention described above.
Please refer to the staged.dependencies package documentation for more details.
+This repository follows some unified processes and standards adopted by its maintainers to ensure software development is carried out consistently within teams and cohesively across other repositories.
+This repository follows the standard tidyverse
style guide and uses lintr
for lint checks. Customized lint configurations are available in this repository’s .lintr
file.
Lightweight is the right weight. This repository follows tinyverse recommedations of limiting dependencies to minimum.
+If the code is not compatible with all (!) historical versions of a given dependenct package, it is required to specify minimal version in the DESCRIPTION
file. In particular: if the development version requires (imports) the development version of another package - it is required to put abc (>= 1.2.3.9000)
.
We continuously test our packages against the newest R version along with the most recent dependencies from CRAN and BioConductor. We recommend that your working environment is also set up in the same way. You can find the details about the R version and packages used in the R CMD check
GitHub Action execution log - there is a step that prints out the R sessionInfo()
.
If you discover bugs on older R versions or with an older set of dependencies, please create the relevant bug reports.
+pre-commit
We highly recommend that you use the pre-commit
tool combined with R hooks for pre-commit
to execute some of the checks before committing and pushing your changes.
Pre-commit hooks are already available in this repository’s .pre-commit-config.yaml
file.
As mentioned previously, all contributions are deeply valued and appreciated. While all contribution data is available as part of the repository insights, to recognize a significant contribution and hence add the contributor to the package authors list, the following rules are enforced:
+git blame
query) OR*Excluding auto-generated code, including but not limited to roxygen
comments or renv.lock
files.
The package maintainer also reserves the right to adjust the criteria to recognize contributions.
+Copyright 2022 F. Hoffmann-La Roche AG + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. ++ +
SECURITY.md
+ If you believe you have found a security vulnerability in any of the repositories in this organization, please report it to us through coordinated disclosure.
+Please do not report security vulnerabilities through public GitHub issues, discussions, or pull requests.
+Instead, please send an email to vulnerability.management[@]roche.com.
+Please include as much of the information listed below as you can to help us better understand and resolve the issue:
+This information will help us triage your report more quickly.
+vignettes/generate_tmc_test_data.Rmd
+ generate_tmc_test_data.Rmd
teal.modules.clinical
+The following script is used to create and save cached synthetic
+CDISC
data to the data/
directory to use in
+examples and tests in the teal.modules.clinical
package.
+This script/vignette was initialized by Emily de la Rua in
+tern
.
Disclaimer: this vignette concerns mainly the development of +minimal and stable test data and it is kept internal for feature +tracking.
+
+library(dplyr)
+library(teal.data)
+
+study_duration_secs <- lubridate::seconds(lubridate::years(2))
+
+sample_fct <- function(x, N, ...) {
+ checkmate::assert_number(N)
+ factor(sample(x, N, replace = TRUE, ...), levels = if (is.factor(x)) levels(x) else x)
+}
+
+retain <- function(df, value_var, event, outside = NA) {
+ indices <- c(1, which(event == TRUE), nrow(df) + 1)
+ values <- c(outside, value_var[event == TRUE])
+ rep(values, diff(indices))
+}
+
+relvar_init <- function(relvar1, relvar2) {
+ if (length(relvar1) != length(relvar2)) {
+ message(simpleError(
+ "The argument value length of relvar1 and relvar2 differ. They must contain the same number of elements."
+ ))
+ return(NA)
+ }
+ return(list("relvar1" = relvar1, "relvar2" = relvar2))
+}
+
+rel_var <- function(df = NULL, var_name = NULL, var_values = NULL, related_var = NULL) {
+ if (is.null(df)) {
+ message("Missing data frame argument value.")
+ return(NA)
+ } else {
+ n_relvar1 <- length(unique(df[, related_var, drop = TRUE]))
+ n_relvar2 <- length(var_values)
+ if (n_relvar1 != n_relvar2) {
+ message(paste("Unequal vector lengths for", related_var, "and", var_name))
+ return(NA)
+ } else {
+ relvar1 <- unique(df[, related_var, drop = TRUE])
+ relvar2_values <- rep(NA, nrow(df))
+ for (r in seq_len(length(relvar1))) {
+ matched <- which(df[, related_var, drop = TRUE] == relvar1[r])
+ relvar2_values[matched] <- var_values[r]
+ }
+ return(relvar2_values)
+ }
+ }
+}
+
+visit_schedule <- function(visit_format = "WEEK",
+ n_assessments = 10L,
+ n_days = 5L) {
+ if (!(toupper(visit_format) %in% c("WEEK", "CYCLE"))) {
+ message("Visit format value must either be: WEEK or CYCLE")
+ return(NA)
+ }
+ if (toupper(visit_format) == "WEEK") {
+ assessments <- 1:n_assessments
+ assessments_ord <- -1:n_assessments
+ visit_values <- c("SCREENING", "BASELINE", paste(toupper(visit_format), assessments, "DAY", (assessments * 7) + 1))
+ } else if (toupper(visit_format) == "CYCLE") {
+ cycles <- sort(rep(1:n_assessments, times = 1, each = n_days))
+ days <- rep(seq(1:n_days), times = n_assessments, each = 1)
+ assessments_ord <- 0:(n_assessments * n_days)
+ visit_values <- c("SCREENING", paste(toupper(visit_format), cycles, "DAY", days))
+ }
+ visit_values <- stats::reorder(factor(visit_values), assessments_ord)
+}
+
+rtpois <- function(n, lambda) stats::qpois(stats::runif(n, stats::dpois(0, lambda), 1), lambda)
+
+rtexp <- function(n, rate, l = NULL, r = NULL) {
+ if (!is.null(l)) {
+ l - log(1 - stats::runif(n)) / rate
+ } else if (!is.null(r)) {
+ -log(1 - stats::runif(n) * (1 - exp(-r * rate))) / rate
+ } else {
+ stats::rexp(n, rate)
+ }
+}
+
+str_extract <- function(string, pattern) regmatches(string, gregexpr(pattern, string))
+
+with_label <- function(x, label) {
+ attr(x, "label") <- as.vector(label)
+ x
+}
+
+common_var_labels <- c(
+ USUBJID = "Unique Subject Identifier",
+ STUDYID = "Study Identifier",
+ PARAM = "Parameter",
+ PARAMCD = "Parameter Code",
+ AVISIT = "Analysis Visit",
+ AVISITN = "Analysis Visit (N)",
+ AVAL = "Analysis Value",
+ AVALU = "Analysis Value Unit",
+ AVALC = "Character Result/Finding",
+ BASE = "Baseline Value",
+ BASE2 = "Screening Value",
+ ABLFL = "Baseline Record Flag",
+ ABLFL2 = "Screening Record Flag",
+ CHG = "Absolute Change from Baseline",
+ PCHG = "Percentage Change from Baseline",
+ ANRIND = "Analysis Reference Range Indicator",
+ BNRIND = "Baseline Reference Range Indicator",
+ ANRLO = "Analysis Normal Range Lower Limit",
+ ANRHI = "Analysis Normal Range Upper Limit",
+ CNSR = "Censor",
+ ADTM = "Analysis Datetime",
+ ADY = "Analysis Relative Day",
+ ASTDY = "Analysis Start Relative Day",
+ AENDY = "Analysis End Relative Day",
+ ASTDTM = "Analysis Start Datetime",
+ AENDTM = "Analysis End Datetime",
+ VISITDY = "Planned Study Day of Visit",
+ EVNTDESC = "Event or Censoring Description",
+ CNSDTDSC = "Censor Date Description",
+ BASETYPE = "Baseline Type",
+ DTYPE = "Derivation Type",
+ ONTRTFL = "On Treatment Record Flag",
+ WORS01FL = "Worst Observation in Window Flag 01",
+ WORS02FL = "Worst Post-Baseline Observation"
+)
ADSL
+
+generate_adsl <- function(N = 200) {
+ set.seed(1)
+ sys_dtm <- lubridate::fast_strptime("20/2/2019 11:16:16.683", "%d/%m/%Y %H:%M:%OS", tz = "UTC")
+ country_site_prob <- c(.5, .121, .077, .077, .075, .052, .046, .025, .014, .003)
+
+ adsl <- tibble::tibble(
+ STUDYID = rep("AB12345", N) %>% with_label("Study Identifier"),
+ COUNTRY = sample_fct(
+ c("CHN", "USA", "BRA", "PAK", "NGA", "RUS", "JPN", "GBR", "CAN", "CHE"),
+ N,
+ prob = country_site_prob
+ ) %>% with_label("Country"),
+ SITEID = sample_fct(1:20, N, prob = rep(country_site_prob, times = 2)),
+ SUBJID = paste("id", seq_len(N), sep = "-") %>% with_label("Subject Identifier for the Study"),
+ AGE = (sapply(stats::rchisq(N, df = 5, ncp = 10), max, 0) + 20) %>% with_label("Age"),
+ SEX = c("F", "M") %>% sample_fct(N, prob = c(.52, .48)) %>% with_label("Sex"),
+ ARMCD = c("ARM A", "ARM B", "ARM C") %>% sample_fct(N) %>% with_label("Planned Arm Code"),
+ ARM = dplyr::recode(
+ .data$ARMCD,
+ "ARM A" = "A: Drug X", "ARM B" = "B: Placebo", "ARM C" = "C: Combination"
+ ) %>% with_label("Description of Planned Arm"),
+ ACTARMCD = .data$ARMCD %>% with_label("Actual Arm Code"),
+ ACTARM = .data$ARM %>% with_label("Description of Actual Arm"),
+ RACE = c(
+ "ASIAN", "BLACK OR AFRICAN AMERICAN", "WHITE", "AMERICAN INDIAN OR ALASKA NATIVE",
+ "MULTIPLE", "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER", "OTHER", "UNKNOWN"
+ ) %>%
+ sample_fct(N, prob = c(.55, .23, .16, .05, .004, .003, .002, .002)) %>%
+ with_label("Race"),
+ TRTSDTM = sys_dtm + sample(seq(0, study_duration_secs), size = N, replace = TRUE) %>%
+ with_label("Datetime of First Exposure to Treatment"),
+ TRTEDTM = c(TRTSDTM + study_duration_secs) %>%
+ with_label("Datetime of Last Exposure to Treatment"),
+ EOSDY = ceiling(as.numeric(difftime(TRTEDTM, TRTSDTM, units = "days"))) %>%
+ with_label("End of Study Relative Day"),
+ EOSDT = lubridate::date(TRTEDTM) %>% with_label("End of Study Date"),
+ STRATA1 = c("A", "B", "C") %>% sample_fct(N) %>% with_label("Stratification Factor 1"),
+ STRATA2 = c("S1", "S2") %>% sample_fct(N) %>% with_label("Stratification Factor 2"),
+ BMRKR1 = stats::rchisq(N, 6) %>% with_label("Continuous Level Biomarker 1"),
+ BMRKR2 = sample_fct(c("LOW", "MEDIUM", "HIGH"), N) %>% with_label("Continuous Level Biomarker 2")
+ )
+
+ # associate sites with countries and regions
+ adsl <- adsl %>%
+ dplyr::mutate(
+ SITEID = paste0(.data$COUNTRY, "-", .data$SITEID) %>% with_label("Study Site Identifier"),
+ REGION1 = factor(dplyr::case_when(
+ COUNTRY %in% c("NGA") ~ "Africa",
+ COUNTRY %in% c("CHN", "JPN", "PAK") ~ "Asia",
+ COUNTRY %in% c("RUS") ~ "Eurasia",
+ COUNTRY %in% c("GBR") ~ "Europe",
+ COUNTRY %in% c("CAN", "USA") ~ "North America",
+ COUNTRY %in% c("BRA") ~ "South America",
+ TRUE ~ as.character(NA)
+ )) %>% with_label("Geographic Region 1"),
+ SAFFL = factor("Y") %>% with_label("Safety Population Flag")
+ ) %>%
+ dplyr::mutate(
+ USUBJID = paste(.data$STUDYID, .data$SITEID, .data$SUBJID, sep = "-") %>%
+ with_label("Unique Subject Identifier")
+ )
+
+ # disposition related variables
+ # using probability of 1 for the "DEATH" level to ensure at least one death record exists
+ l_dcsreas <- list(
+ choices = c(
+ "ADVERSE EVENT", "DEATH", "LACK OF EFFICACY", "PHYSICIAN DECISION",
+ "PROTOCOL VIOLATION", "WITHDRAWAL BY PARENT/GUARDIAN", "WITHDRAWAL BY SUBJECT"
+ ),
+ prob = c(.2, 1, .1, .1, .2, .1, .1)
+ )
+ l_dthcat_other <- list(
+ choices = c(
+ "Post-study reporting of death", "LOST TO FOLLOW UP", "MISSING", "SUICIDE", "UNKNOWN"
+ ),
+ prob = c(.1, .3, .3, .2, .1)
+ )
+
+ adsl <- adsl %>%
+ dplyr::mutate(
+ EOSSTT = dplyr::case_when(
+ EOSDY == max(EOSDY, na.rm = TRUE) ~ "COMPLETED",
+ EOSDY < max(EOSDY, na.rm = TRUE) ~ "DISCONTINUED",
+ is.na(TRTEDTM) ~ "ONGOING"
+ ) %>% with_label("End of Study Status")
+ ) %>%
+ dplyr::mutate(
+ EOTSTT = .data$EOSSTT %>% with_label("End of Treatment Status")
+ ) %>%
+ dplyr::mutate(
+ DCSREAS = ifelse(
+ .data$EOSSTT == "DISCONTINUED",
+ sample(x = l_dcsreas$choices, size = N, replace = TRUE, prob = l_dcsreas$prob),
+ as.character(NA)
+ ) %>% with_label("Reason for Discontinuation from Study")
+ )
+
+ tmc_ex_adsl <- adsl %>%
+ dplyr::mutate(DTHDT = dplyr::case_when(
+ DCSREAS == "DEATH" ~ lubridate::date(TRTEDTM + lubridate::days(sample(0:50, size = N, replace = TRUE)))
+ ) %>% with_label("Date of Death"))
+
+ save(tmc_ex_adsl, file = "data/tmc_ex_adsl.rda", compress = "xz")
+}
ADAE
+
+generate_adae <- function(adsl = tmc_ex_adsl,
+ max_n_aes = 5) {
+ set.seed(1)
+ lookup_ae <- tibble::tribble(
+ ~AEBODSYS, ~AELLT, ~AEDECOD, ~AEHLT, ~AEHLGT, ~AETOXGR, ~AESOC, ~AESER, ~AEREL,
+ "cl A.1", "llt A.1.1.1.1", "dcd A.1.1.1.1", "hlt A.1.1.1", "hlgt A.1.1", "1", "cl A", "N", "N",
+ "cl A.1", "llt A.1.1.1.2", "dcd A.1.1.1.2", "hlt A.1.1.1", "hlgt A.1.1", "2", "cl A", "Y", "N",
+ "cl B.1", "llt B.1.1.1.1", "dcd B.1.1.1.1", "hlt B.1.1.1", "hlgt B.1.1", "5", "cl B", "Y", "Y",
+ "cl B.2", "llt B.2.1.2.1", "dcd B.2.1.2.1", "hlt B.2.1.2", "hlgt B.2.1", "3", "cl B", "N", "N",
+ "cl B.2", "llt B.2.2.3.1", "dcd B.2.2.3.1", "hlt B.2.2.3", "hlgt B.2.2", "1", "cl B", "Y", "N",
+ "cl C.1", "llt C.1.1.1.3", "dcd C.1.1.1.3", "hlt C.1.1.1", "hlgt C.1.1", "4", "cl C", "N", "Y",
+ "cl C.2", "llt C.2.1.2.1", "dcd C.2.1.2.1", "hlt C.2.1.2", "hlgt C.2.1", "2", "cl C", "N", "Y",
+ "cl D.1", "llt D.1.1.1.1", "dcd D.1.1.1.1", "hlt D.1.1.1", "hlgt D.1.1", "5", "cl D", "Y", "Y",
+ "cl D.1", "llt D.1.1.4.2", "dcd D.1.1.4.2", "hlt D.1.1.4", "hlgt D.1.1", "3", "cl D", "N", "N",
+ "cl D.2", "llt D.2.1.5.3", "dcd D.2.1.5.3", "hlt D.2.1.5", "hlgt D.2.1", "1", "cl D", "N", "Y"
+ )
+
+ aag <- utils::read.table(
+ sep = ",", header = TRUE,
+ text = paste(
+ "NAMVAR,SRCVAR,GRPTYPE,REFNAME,REFTERM,SCOPE",
+ "CQ01NAM,AEDECOD,CUSTOM,D.2.1.5.3/A.1.1.1.1 aesi,dcd D.2.1.5.3,",
+ "CQ01NAM,AEDECOD,CUSTOM,D.2.1.5.3/A.1.1.1.1 aesi,dcd A.1.1.1.1,",
+ "SMQ01NAM,AEDECOD,SMQ,C.1.1.1.3/B.2.2.3.1 aesi,dcd C.1.1.1.3,BROAD",
+ "SMQ01NAM,AEDECOD,SMQ,C.1.1.1.3/B.2.2.3.1 aesi,dcd B.2.2.3.1,BROAD",
+ "SMQ02NAM,AEDECOD,SMQ,Y.9.9.9.9/Z.9.9.9.9 aesi,dcd Y.9.9.9.9,NARROW",
+ "SMQ02NAM,AEDECOD,SMQ,Y.9.9.9.9/Z.9.9.9.9 aesi,dcd Z.9.9.9.9,NARROW",
+ sep = "\n"
+ ), stringsAsFactors = FALSE
+ )
+
+ adae <- Map(
+ function(id, sid) {
+ n_aes <- sample(c(0, seq_len(max_n_aes)), 1)
+ i <- sample(seq_len(nrow(lookup_ae)), n_aes, TRUE)
+ dplyr::mutate(
+ lookup_ae[i, ],
+ USUBJID = id,
+ STUDYID = sid
+ )
+ },
+ adsl$USUBJID,
+ adsl$STUDYID
+ ) %>%
+ Reduce(rbind, .) %>%
+ `[`(c(10, 11, 1, 2, 3, 4, 5, 6, 7, 8, 9)) %>%
+ dplyr::mutate(
+ AETERM = gsub("dcd", "trm", .data$AEDECOD) %>% with_label("Reported Term for the Adverse Event"),
+ AESEV = dplyr::case_when(
+ AETOXGR == 1 ~ "MILD",
+ AETOXGR %in% c(2, 3) ~ "MODERATE",
+ AETOXGR %in% c(4, 5) ~ "SEVERE"
+ ) %>% with_label("Severity/Intensity")
+ )
+
+ # merge adsl to be able to add AE date and study day variables
+ adae <- dplyr::inner_join(adae, adsl, by = c("STUDYID", "USUBJID"), multiple = "all") %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::mutate(ASTDTM = sample(
+ seq(lubridate::as_datetime(TRTSDTM), lubridate::as_datetime(TRTENDT), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::mutate(ASTDY = ceiling(difftime(ASTDTM, TRTSDTM, units = "days"))) %>%
+ # add 1 to end of range incase both values passed to sample() are the same
+ dplyr::mutate(AENDTM = sample(
+ seq(lubridate::as_datetime(ASTDTM), lubridate::as_datetime(TRTENDT + 1), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::mutate(AENDY = ceiling(difftime(AENDTM, TRTSDTM, units = "days"))) %>%
+ dplyr::mutate(LDOSEDTM = dplyr::case_when(
+ TRTSDTM < ASTDTM ~ lubridate::as_datetime(stats::runif(1, TRTSDTM, ASTDTM)),
+ TRUE ~ ASTDTM
+ )) %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ASTDTM, .data$AETERM)
+
+ adae <- adae %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(AESEQ = seq_len(dplyr::n())) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$ASTDTM,
+ .data$AETERM,
+ .data$AESEQ
+ )
+
+ outcomes <- c(
+ "UNKNOWN",
+ "NOT RECOVERED/NOT RESOLVED",
+ "RECOVERED/RESOLVED WITH SEQUELAE",
+ "RECOVERING/RESOLVING",
+ "RECOVERED/RESOLVED"
+ )
+
+ adae <- adae %>%
+ dplyr::mutate(
+ AEOUT = factor(ifelse(
+ .data$AETOXGR == "5",
+ "FATAL",
+ as.character(sample_fct(outcomes, nrow(adae), prob = c(0.1, 0.2, 0.1, 0.3, 0.3)))
+ )) %>% with_label("Outcome of Adverse Event"),
+ TRTEMFL = ifelse(.data$ASTDTM >= .data$TRTSDTM, "Y", "") %>%
+ with_label("Treatment Emergent Analysis Flag")
+ )
+
+ l_aag <- split(aag, interaction(aag$NAMVAR, aag$SRCVAR, aag$GRPTYPE, drop = TRUE))
+
+ # Create aesi flags
+ l_aesi <- lapply(l_aag, function(d_adag, d_adae) {
+ names(d_adag)[names(d_adag) == "REFTERM"] <- d_adag$SRCVAR[1]
+ names(d_adag)[names(d_adag) == "REFNAME"] <- d_adag$NAMVAR[1]
+
+ if (d_adag$GRPTYPE[1] == "CUSTOM") {
+ d_adag <- d_adag[-which(names(d_adag) == "SCOPE")]
+ } else if (d_adag$GRPTYPE[1] == "SMQ") {
+ names(d_adag)[names(d_adag) == "SCOPE"] <- paste0(substr(d_adag$NAMVAR[1], 1, 5), "SC")
+ }
+
+ d_adag <- d_adag[-which(names(d_adag) %in% c("NAMVAR", "SRCVAR", "GRPTYPE"))]
+ d_new <- dplyr::left_join(x = d_adae, y = d_adag, by = intersect(names(d_adae), names(d_adag)))
+ d_new[, dplyr::setdiff(names(d_new), names(d_adae)), drop = FALSE]
+ }, adae)
+ adae <- dplyr::bind_cols(adae, l_aesi)
+
+ actions <- c(
+ "DOSE RATE REDUCED",
+ "UNKNOWN",
+ "NOT APPLICABLE",
+ "DRUG INTERRUPTED",
+ "DRUG WITHDRAWN",
+ "DOSE INCREASED",
+ "DOSE NOT CHANGED",
+ "DOSE REDUCED",
+ "NOT EVALUABLE"
+ )
+
+ tmc_ex_adae <- adae %>%
+ dplyr::mutate(
+ AEACN = factor(ifelse(
+ .data$AETOXGR == "5",
+ "NOT EVALUABLE",
+ as.character(sample_fct(actions, nrow(adae), prob = c(0.05, 0.05, 0.05, 0.01, 0.05, 0.1, 0.45, 0.1, 0.05)))
+ )) %>% with_label("Action Taken With Study Treatment")
+ ) %>%
+ col_relabel(
+ AEBODSYS = "Body System or Organ Class",
+ AELLT = "Lowest Level Term",
+ AEDECOD = "Dictionary-Derived Term",
+ AEHLT = "High Level Term",
+ AEHLGT = "High Level Group Term",
+ AETOXGR = "Analysis Toxicity Grade",
+ AESOC = "Primary System Organ Class",
+ AESER = "Serious Event",
+ AEREL = "Analysis Causality",
+ AESEQ = "Sponsor-Defined Identifier",
+ LDOSEDTM = "End Time/Time of Last Dose",
+ CQ01NAM = "CQ 01 Reference Name",
+ SMQ01NAM = "SMQ 01 Reference Name",
+ SMQ01SC = "SMQ 01 Scope",
+ SMQ02NAM = "SMQ 02 Reference Name",
+ SMQ02SC = "SMQ 02 Scope"
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adae)[is.na(col_labels(tmc_ex_adae))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adae[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adae, file = "data/tmc_ex_adae.rda", compress = "xz")
+}
ADAETTE
+
+generate_adaette <- function(adsl = tmc_ex_adsl) {
+ set.seed(1)
+ lookup_adaette <- tibble::tribble(
+ ~ARM, ~CATCD, ~CAT, ~LAMBDA, ~CNSR_P,
+ "ARM A", "1", "any adverse event", 1 / 80, 0.4,
+ "ARM B", "1", "any adverse event", 1 / 100, 0.2,
+ "ARM C", "1", "any adverse event", 1 / 60, 0.42,
+ "ARM A", "2", "any serious adverse event", 1 / 100, 0.3,
+ "ARM B", "2", "any serious adverse event", 1 / 150, 0.1,
+ "ARM C", "2", "any serious adverse event", 1 / 80, 0.32,
+ "ARM A", "3", "a grade 3-5 adverse event", 1 / 80, 0.2,
+ "ARM B", "3", "a grade 3-5 adverse event", 1 / 100, 0.08,
+ "ARM C", "3", "a grade 3-5 adverse event", 1 / 60, 0.23
+ )
+ evntdescr_sel <- "Preferred Term"
+ cnsdtdscr_sel <- c(
+ "Clinical Cut Off",
+ "Completion or Discontinuation",
+ "End of AE Reporting Period"
+ )
+
+ random_patient_data <- function(patient_info) {
+ startdt <- lubridate::date(patient_info$TRTSDTM)
+ trtedtm <- lubridate::floor_date(dplyr::case_when(
+ is.na(patient_info$TRTEDTM) ~ lubridate::date(patient_info$TRTSDTM) + study_duration_secs,
+ TRUE ~ lubridate::date(patient_info$TRTEDTM)
+ ), unit = "day")
+ enddts <- c(patient_info$EOSDT, lubridate::date(trtedtm))
+ enddts_min_index <- which.min(enddts)
+ adt <- enddts[enddts_min_index]
+ adtm <- lubridate::as_datetime(adt)
+ ady <- as.numeric(adt - startdt + 1)
+ data.frame(
+ ARM = patient_info$ARM,
+ STUDYID = patient_info$STUDYID,
+ SITEID = patient_info$SITEID,
+ USUBJID = patient_info$USUBJID,
+ PARAMCD = "AEREPTTE",
+ PARAM = "Time to end of AE reporting period",
+ CNSR = 0,
+ AVAL = lubridate::days(ady) / lubridate::years(1),
+ AVALU = "YEARS",
+ EVNTDESC = ifelse(enddts_min_index == 1, "Completion or Discontinuation", "End of AE Reporting Period"),
+ CNSDTDSC = NA,
+ ADTM = adtm,
+ ADY = ady,
+ stringsAsFactors = FALSE
+ )
+ }
+
+ paramcd_hy <- c("HYSTTEUL", "HYSTTEBL")
+ param_hy <- c("Time to Hy's Law Elevation in relation to ULN", "Time to Hy's Law Elevation in relation to Baseline")
+ param_init_list <- relvar_init(param_hy, paramcd_hy)
+ adsl_hy <- dplyr::select(adsl, "STUDYID", "USUBJID", "TRTSDTM", "SITEID", "ARM")
+ adaette_hy <- expand.grid(
+ STUDYID = unique(adsl$STUDYID),
+ USUBJID = adsl$USUBJID,
+ PARAM = as.factor(param_init_list$relvar1),
+ stringsAsFactors = FALSE
+ )
+
+ adaette_hy <- dplyr::left_join(adaette_hy, adsl_hy, by = c("STUDYID", "USUBJID"), multiple = "all") %>%
+ dplyr::mutate(
+ PARAMCD = factor(rel_var(
+ df = as.data.frame(adaette_hy),
+ var_values = param_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+ ) %>%
+ dplyr::mutate(
+ CNSR = sample(c(0, 1), prob = c(0.1, 0.9), size = dplyr::n(), replace = TRUE),
+ EVNTDESC = dplyr::if_else(
+ .data$CNSR == 0,
+ "First Post-Baseline Raised ALT or AST Elevation Result",
+ NA_character_
+ ),
+ CNSDTDSC = dplyr::if_else(.data$CNSR == 0, NA_character_,
+ sample(c("Last Post-Baseline ALT or AST Result", "Treatment Start"),
+ prob = c(0.9, 0.1),
+ size = dplyr::n(), replace = TRUE
+ )
+ )
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(ADTM = dplyr::case_when(
+ CNSDTDSC == "Treatment Start" ~ TRTSDTM,
+ TRUE ~ TRTSDTM + sample(seq(0, study_duration_secs), size = dplyr::n(), replace = TRUE)
+ )) %>%
+ dplyr::mutate(
+ ADY_int = lubridate::date(ADTM) - lubridate::date(TRTSDTM) + 1,
+ ADY = as.numeric(ADY_int),
+ AVAL = lubridate::days(ADY_int) / lubridate::weeks(1),
+ AVALU = "WEEKS"
+ ) %>%
+ dplyr::select(-TRTSDTM, -ADY_int)
+
+ random_ae_data <- function(lookup_info, patient_info, patient_data) {
+ cnsr <- sample(c(0, 1), 1, prob = c(1 - lookup_info$CNSR_P, lookup_info$CNSR_P))
+ ae_rep_tte <- patient_data$AVAL[patient_data$PARAMCD == "AEREPTTE"]
+ data.frame(
+ ARM = rep(patient_data$ARM, 2),
+ STUDYID = rep(patient_data$STUDYID, 2),
+ SITEID = rep(patient_data$SITEID, 2),
+ USUBJID = rep(patient_data$USUBJID, 2),
+ PARAMCD = c(
+ paste0("AETTE", lookup_info$CATCD),
+ paste0("AETOT", lookup_info$CATCD)
+ ),
+ PARAM = c(
+ paste("Time to first occurrence of", lookup_info$CAT),
+ paste("Number of occurrences of", lookup_info$CAT)
+ ),
+ CNSR = c(cnsr, NA),
+ AVAL = c(
+ ifelse(cnsr == 1, ae_rep_tte, rtexp(1, lookup_info$LAMBDA * 365.25, r = ae_rep_tte)),
+ ifelse(cnsr == 1, 0, rtpois(1, lookup_info$LAMBDA * 365.25))
+ ),
+ AVALU = c("YEARS", NA),
+ EVNTDESC = c(ifelse(cnsr == 0, sample(evntdescr_sel, 1), ""), NA),
+ CNSDTDSC = c(ifelse(cnsr == 1, sample(cnsdtdscr_sel, 1), ""), NA),
+ stringsAsFactors = FALSE
+ ) %>% dplyr::mutate(
+ ADY = dplyr::if_else(is.na(AVALU), NA_real_, ceiling(as.numeric(lubridate::dyears(AVAL), "days"))),
+ ADTM = dplyr::if_else(
+ is.na(AVALU),
+ lubridate::as_datetime(NA),
+ patient_info$TRTSDTM + lubridate::days(ADY)
+ )
+ )
+ }
+
+ adaette <- split(adsl, adsl$USUBJID) %>%
+ lapply(function(patient_info) {
+ patient_data <- random_patient_data(patient_info)
+ lookup_arm <- lookup_adaette %>%
+ dplyr::filter(.data$ARM == as.character(patient_info$ARMCD))
+ ae_data <- split(lookup_arm, lookup_arm$CATCD) %>%
+ lapply(random_ae_data, patient_data = patient_data, patient_info = patient_info) %>%
+ Reduce(rbind, .)
+ dplyr::bind_rows(patient_data, ae_data)
+ }) %>%
+ Reduce(rbind, .)
+ adaette <- rbind(adaette, adaette_hy)
+
+ tmc_ex_adaette <- adsl %>%
+ dplyr::inner_join(
+ dplyr::select(adaette, -"SITEID", -"ARM"),
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::arrange(.data$ADTM) %>%
+ dplyr::mutate(PARAM = as.factor(.data$PARAM)) %>%
+ dplyr::mutate(PARAMCD = as.factor(.data$PARAMCD)) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$ADTM
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adaette)[is.na(col_labels(tmc_ex_adaette))]),
+ function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adaette[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adaette, file = "data/tmc_ex_adaette.rda", compress = "xz")
+}
ADCM
+
+generate_adcm <- function(adsl = tmc_ex_adsl,
+ max_n_cms = 5L) {
+ set.seed(1)
+ lookup_cm <- tibble::tribble(
+ ~CMCLAS, ~CMDECOD, ~ATIREL,
+ "medcl A", "medname A_1/3", "PRIOR",
+ "medcl A", "medname A_2/3", "CONCOMITANT",
+ "medcl A", "medname A_3/3", "CONCOMITANT",
+ "medcl B", "medname B_1/4", "CONCOMITANT",
+ "medcl B", "medname B_2/4", "PRIOR",
+ "medcl B", "medname B_3/4", "PRIOR",
+ "medcl B", "medname B_4/4", "CONCOMITANT",
+ "medcl C", "medname C_1/2", "CONCOMITANT",
+ "medcl C", "medname C_2/2", "CONCOMITANT"
+ )
+
+ adcm <- Map(function(id, sid) {
+ n_cms <- sample(c(0, seq_len(max_n_cms)), 1)
+ i <- sample(seq_len(nrow(lookup_cm)), n_cms, TRUE)
+ dplyr::mutate(
+ lookup_cm[i, ],
+ USUBJID = id,
+ STUDYID = sid
+ )
+ }, adsl$USUBJID, adsl$STUDYID) %>%
+ Reduce(rbind, .) %>%
+ `[`(c(4, 5, 1, 2, 3)) %>%
+ dplyr::mutate(CMCAT = .data$CMCLAS %>% with_label("Category for Medication"))
+
+ # merge adsl to be able to add CM date and study day variables
+ adcm <- dplyr::inner_join(
+ adcm,
+ adsl,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::mutate(ASTDTM = sample(
+ seq(lubridate::as_datetime(TRTSDTM), lubridate::as_datetime(TRTENDT), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::mutate(ASTDY = ceiling(difftime(ASTDTM, TRTSDTM, units = "days"))) %>%
+ # add 1 to end of range incase both values passed to sample() are the same
+ dplyr::mutate(AENDTM = sample(
+ seq(lubridate::as_datetime(ASTDTM), lubridate::as_datetime(TRTENDT + 1), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::mutate(AENDY = ceiling(difftime(AENDTM, TRTSDTM, units = "days"))) %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(STUDYID, USUBJID, ASTDTM)
+
+ tmc_ex_adcm <- adcm %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(CMSEQ = seq_len(dplyr::n())) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ASTDTM, .data$CMSEQ) %>%
+ dplyr::mutate(
+ ATC1 = paste("ATCCLAS1", substr(.data$CMDECOD, 9, 9)) %>% with_label("ATC Level 1 Text"),
+ ATC2 = paste("ATCCLAS2", substr(.data$CMDECOD, 9, 9)) %>% with_label("ATC Level 2 Text"),
+ ATC3 = paste("ATCCLAS3", substr(.data$CMDECOD, 9, 9)) %>% with_label("ATC Level 3 Text"),
+ ATC4 = paste("ATCCLAS4", substr(.data$CMDECOD, 9, 9)) %>% with_label("ATC Level 4 Text")
+ ) %>%
+ dplyr::mutate(
+ CMINDC = sample(c(
+ "Nausea", "Hypertension", "Urticaria", "Fever",
+ "Asthma", "Infection", "Diabete", "Diarrhea", "Pneumonia"
+ ), dplyr::n(), replace = TRUE) %>% with_label("Indication"),
+ CMDOSE = sample(1:99, dplyr::n(), replace = TRUE) %>% with_label("Dose per Administration"),
+ CMTRT = substr(.data$CMDECOD, 9, 13) %>% with_label("Reported Name of Drug, Med, or Therapy"),
+ CMDOSU = sample(c(
+ "ug/mL", "ug/kg/day", "%", "uL", "DROP",
+ "umol/L", "mg", "mg/breath", "ug"
+ ), dplyr::n(), replace = TRUE) %>% with_label("Dose Units")
+ ) %>%
+ dplyr::mutate(
+ CMROUTE = sample(c(
+ "INTRAVENOUS", "ORAL", "NASAL",
+ "INTRAMUSCULAR", "SUBCUTANEOUS", "INHALED", "RECTAL", "UNKNOWN"
+ ), dplyr::n(), replace = TRUE) %>% with_label("Route of Administration"),
+ CMDOSFRQ = sample(c(
+ "Q4W", "QN", "Q4H", "UNKNOWN", "TWICE",
+ "Q4H", "QD", "TID", "4 TIMES PER MONTH"
+ ), dplyr::n(), replace = TRUE) %>% with_label("Dosing Frequency per Interval")
+ ) %>%
+ col_relabel(
+ CMCLAS = "Medication Class",
+ CMDECOD = "Standardized Medication Name",
+ ATIREL = "Time Relation of Medication",
+ CMSEQ = "Sponsor-Defined Identifier"
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adcm)[is.na(col_labels(tmc_ex_adcm))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adcm[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adcm, file = "data/tmc_ex_adcm.rda", compress = "xz")
+}
ADEG
+
+generate_adeg <- function(adsl = tmc_ex_adsl,
+ n_assessments = 3L,
+ n_days = 3L,
+ max_n_eg = 3L) {
+ set.seed(1)
+ param <- c("QT Duration", "RR Duration", "Heart Rate", "ECG Interpretation")
+ paramcd <- c("QT", "RR", "HR", "ECGINTP")
+ paramu <- c("msec", "msec", "beats/min", "")
+ visit_format <- "WEEK"
+
+ param_init_list <- relvar_init(param, paramcd)
+ unit_init_list <- relvar_init(param, paramu)
+
+ adeg <- expand.grid(
+ STUDYID = unique(adsl$STUDYID),
+ USUBJID = adsl$USUBJID,
+ PARAM = as.factor(param_init_list$relvar1),
+ AVISIT = visit_schedule(visit_format = visit_format, n_assessments = n_assessments, n_days = n_days),
+ stringsAsFactors = FALSE
+ )
+
+ adeg$PARAMCD <- as.factor(rel_var(
+ df = adeg,
+ var_name = "PARAMCD",
+ var_values = param_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ adeg <- adeg %>% dplyr::mutate(AVAL = dplyr::case_when(
+ .data$PARAMCD == "QT" ~ stats::rnorm(nrow(adeg), mean = 350, sd = 100),
+ .data$PARAMCD == "RR" ~ stats::rnorm(nrow(adeg), mean = 1050, sd = 300),
+ .data$PARAMCD == "HR" ~ stats::rnorm(nrow(adeg), mean = 70, sd = 20),
+ .data$PARAMCD == "ECGINTP" ~ NA_real_
+ ))
+
+ adeg <- adeg %>% dplyr::mutate(AVISITN = dplyr::case_when(
+ AVISIT == "SCREENING" ~ -1,
+ AVISIT == "BASELINE" ~ 0,
+ (grepl("^WEEK", AVISIT) | grepl("^CYCLE", AVISIT)) ~ as.numeric(AVISIT) - 2,
+ TRUE ~ NA_real_
+ ))
+
+ adeg$AVALU <- as.factor(rel_var(
+ df = adeg,
+ var_name = "AVALU",
+ var_values = unit_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ adeg <- adeg[order(adeg$STUDYID, adeg$USUBJID, adeg$PARAMCD, adeg$AVISITN), ]
+ adeg <- Reduce(rbind, lapply(split(adeg, adeg$USUBJID), function(x) {
+ x$STUDYID <- adsl$STUDYID[which(adsl$USUBJID == x$USUBJID[1])]
+ x$ABLFL <- ifelse(toupper(visit_format) == "WEEK" & x$AVISIT == "BASELINE",
+ "Y",
+ ifelse(toupper(visit_format) == "CYCLE" & x$AVISIT == "CYCLE 1 DAY 1", "Y", "")
+ )
+ x
+ }))
+
+ adeg$BASE <- ifelse(adeg$AVISITN >= 0, retain(adeg, adeg$AVAL, adeg$ABLFL == "Y"), adeg$AVAL)
+ adeg <- adeg %>%
+ dplyr::mutate(ANRLO = dplyr::case_when(
+ .data$PARAMCD == "QT" ~ 200,
+ .data$PARAMCD == "RR" ~ 600,
+ .data$PARAMCD == "HR" ~ 40,
+ .data$PARAMCD == "ECGINTP" ~ NA_real_
+ )) %>%
+ dplyr::mutate(ANRHI = dplyr::case_when(
+ .data$PARAMCD == "QT" ~ 500,
+ .data$PARAMCD == "RR" ~ 1500,
+ .data$PARAMCD == "HR" ~ 100,
+ .data$PARAMCD == "ECGINTP" ~ NA_real_
+ )) %>%
+ dplyr::mutate(ANRIND = factor(dplyr::case_when(
+ .data$AVAL < .data$ANRLO ~ "LOW",
+ .data$AVAL >= .data$ANRLO & .data$AVAL <= .data$ANRHI ~ "NORMAL",
+ .data$AVAL > .data$ANRHI ~ "HIGH"
+ )))
+
+ adeg <- adeg %>%
+ dplyr::mutate(CHG = ifelse(.data$AVISITN > 0, .data$AVAL - .data$BASE, NA)) %>%
+ dplyr::mutate(PCHG = ifelse(.data$AVISITN > 0, 100 * (.data$CHG / .data$BASE), NA)) %>%
+ dplyr::mutate(BASETYPE = "LAST") %>%
+ dplyr::group_by(.data$USUBJID, .data$PARAMCD, .data$BASETYPE) %>%
+ dplyr::mutate(BNRIND = .data$ANRIND[.data$ABLFL == "Y"]) %>%
+ dplyr::ungroup() %>%
+ dplyr::mutate(DTYPE = NA)
+
+ adeg$ANRIND <- factor(adeg$ANRIND, levels = c("LOW", "NORMAL", "HIGH"))
+ adeg$BNRIND <- factor(adeg$BNRIND, levels = c("LOW", "NORMAL", "HIGH"))
+
+ adeg <- dplyr::inner_join(
+ adsl,
+ adeg,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::ungroup() %>%
+ dplyr::group_by(USUBJID) %>%
+ dplyr::arrange(USUBJID, AVISITN) %>%
+ dplyr::mutate(ADTM = rep(
+ sort(sample(
+ seq(lubridate::as_datetime(TRTSDTM[1]), lubridate::as_datetime(TRTENDT[1]), by = "day"),
+ size = nlevels(AVISIT)
+ )),
+ each = n() / nlevels(AVISIT)
+ )) %>%
+ dplyr::ungroup() %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ADTM)
+
+ adeg <- adeg %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$BASETYPE,
+ .data$AVISITN,
+ .data$DTYPE,
+ .data$ADTM
+ )
+
+ adeg <- adeg %>%
+ dplyr::mutate(ONTRTFL = factor(dplyr::case_when(
+ is.na(.data$TRTSDTM) ~ "",
+ is.na(.data$ADTM) ~ "Y",
+ (.data$ADTM < .data$TRTSDTM) ~ "",
+ (.data$ADTM > .data$TRTEDTM) ~ "",
+ TRUE ~ "Y"
+ ))) %>%
+ dplyr::mutate(AVALC = ifelse(
+ .data$PARAMCD == "ECGINTP",
+ as.character(sample_fct(c("ABNORMAL", "NORMAL"), nrow(adeg), prob = c(0.25, 0.75))),
+ as.character(.data$AVAL)
+ ))
+
+ adeg <- adeg %>% dplyr::mutate(row_check = seq_len(nrow(adeg)))
+ get_groups <- function(data, minimum) {
+ data <- data %>%
+ dplyr::group_by(.data$USUBJID, .data$PARAMCD, .data$BASETYPE) %>%
+ dplyr::arrange(.data$ADTM) %>%
+ dplyr::filter(
+ (.data$AVISIT != "BASELINE" & .data$AVISIT != "SCREENING") &
+ (.data$ONTRTFL == "Y" | .data$ADTM <= .data$TRTSDTM)
+ ) %>%
+ {
+ if (minimum == TRUE) {
+ dplyr::filter(., .data$AVAL == min(.data$AVAL)) %>%
+ dplyr::mutate(., DTYPE = "MINIMUM", AVISIT = "POST-BASELINE MINIMUM")
+ } else {
+ dplyr::filter(., .data$AVAL == max(.data$AVAL)) %>%
+ dplyr::mutate(., DTYPE = "MAXIMUM", AVISIT = "POST-BASELINE MAXIMUM")
+ }
+ } %>%
+ dplyr::slice(1) %>%
+ dplyr::ungroup()
+ return(data)
+ }
+
+ lbls <- col_labels(adeg)
+ adeg <- rbind(adeg, get_groups(adeg, TRUE), get_groups(adeg, FALSE)) %>%
+ dplyr::arrange(.data$row_check) %>%
+ dplyr::group_by(.data$USUBJID, .data$PARAMCD, .data$BASETYPE) %>%
+ dplyr::arrange(.data$AVISIT, .by_group = TRUE) %>%
+ dplyr::ungroup()
+ col_labels(adeg) <- lbls
+
+ adeg <- adeg[, -which(names(adeg) %in% c("row_check"))]
+ flag_variables <- function(data, worst_obs) {
+ data_compare <- data %>%
+ dplyr::mutate(row_check = seq_len(nrow(data)))
+ data <- data_compare %>%
+ {
+ if (worst_obs == FALSE) {
+ dplyr::group_by(., .data$USUBJID, .data$PARAMCD, .data$BASETYPE, .data$AVISIT) %>%
+ dplyr::arrange(., .data$ADTM)
+ } else {
+ dplyr::group_by(., .data$USUBJID, .data$PARAMCD, .data$BASETYPE)
+ }
+ } %>%
+ dplyr::filter(
+ .data$AVISITN > 0 & (.data$ONTRTFL == "Y" | .data$ADTM <= .data$TRTSDTM) &
+ is.na(.data$DTYPE)
+ ) %>%
+ {
+ if (worst_obs == TRUE) {
+ dplyr::arrange(., .data$AVALC) %>% dplyr::filter(., ifelse(
+ .data$PARAMCD == "ECGINTP",
+ ifelse(.data$AVALC == "ABNORMAL", .data$AVALC == "ABNORMAL", .data$AVALC == "NORMAL"),
+ .data$AVAL == min(.data$AVAL)
+ ))
+ } else {
+ dplyr::filter(., ifelse(
+ .data$PARAMCD == "ECGINTP",
+ .data$AVALC == "ABNORMAL" | .data$AVALC == "NORMAL",
+ .data$AVAL == min(.data$AVAL)
+ ))
+ }
+ } %>%
+ dplyr::slice(1) %>%
+ {
+ if (worst_obs == TRUE) {
+ dplyr::mutate(., new_var = dplyr::case_when(
+ (.data$AVALC == "ABNORMAL" | .data$AVALC == "NORMAL") ~ "Y",
+ (!is.na(.data$AVAL) & is.na(.data$DTYPE)) ~ "Y",
+ TRUE ~ ""
+ ))
+ } else {
+ dplyr::mutate(., new_var = dplyr::case_when(
+ (.data$AVALC == "ABNORMAL" | .data$AVALC == "NORMAL") ~ "Y",
+ (!is.na(.data$AVAL) & is.na(.data$DTYPE)) ~ "Y",
+ TRUE ~ ""
+ ))
+ }
+ } %>%
+ dplyr::ungroup()
+
+ data_compare$new_var <- ifelse(data_compare$row_check %in% data$row_check, "Y", "")
+ data_compare <- data_compare[, -which(names(data_compare) %in% c("row_check"))]
+
+ return(data_compare)
+ }
+ adeg <- flag_variables(adeg, FALSE) %>% dplyr::rename(WORS01FL = "new_var")
+ adeg <- flag_variables(adeg, TRUE) %>% dplyr::rename(WORS02FL = "new_var")
+
+ tmc_ex_adeg <- adeg %>%
+ dplyr::group_by(.data$USUBJID, .data$PARAMCD, .data$BASETYPE) %>%
+ dplyr::mutate(BASEC = ifelse(
+ .data$PARAMCD == "ECGINTP",
+ .data$AVALC[.data$AVISIT == "BASELINE"],
+ as.character(.data$BASE)
+ )) %>%
+ dplyr::ungroup() %>%
+ col_relabel(BASEC = "Baseline Character Value")
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adeg)[is.na(col_labels(tmc_ex_adeg))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adeg[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adeg, file = "data/tmc_ex_adeg.rda", compress = "xz")
+}
ADEX
+
+generate_adex <- function(adsl = tmc_ex_adsl,
+ n_assessments = 3L,
+ n_days = 3L,
+ max_n_exs = 3L) {
+ set.seed(1)
+ param <- c(
+ "Dose administered during constant dosing interval",
+ "Number of doses administered during constant dosing interval",
+ "Total dose administered",
+ "Total number of doses administered"
+ )
+ paramcd <- c("DOSE", "NDOSE", "TDOSE", "TNDOSE")
+ paramu <- c("mg", " ", "mg", " ")
+ parcat1 <- c("INDIVIDUAL", "OVERALL")
+ parcat2 <- c("Drug A", "Drug B")
+ visit_format <- "WEEK"
+
+ param_init_list <- relvar_init(param, paramcd)
+ unit_init_list <- relvar_init(param, paramu)
+
+ adex <- expand.grid(
+ STUDYID = unique(adsl$STUDYID),
+ USUBJID = adsl$USUBJID,
+ PARAM = c(
+ rep(
+ param_init_list$relvar1[1],
+ length(levels(visit_schedule(visit_format = visit_format, n_assessments = n_assessments, n_days = n_days)))
+ ),
+ rep(
+ param_init_list$relvar1[2],
+ length(levels(visit_schedule(visit_format = visit_format, n_assessments = n_assessments, n_days = n_days)))
+ ),
+ param_init_list$relvar1[3:4]
+ ),
+ stringsAsFactors = FALSE
+ )
+
+ adex$PARAMCD <- as.factor(rel_var(
+ df = adex,
+ var_name = "PARAMCD",
+ var_values = param_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ adex$AVALU <- as.factor(rel_var(
+ df = adex,
+ var_name = "AVALU",
+ var_values = unit_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ adex <- adex %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(PARCAT_ind = sample(c(1, 2), size = 1)) %>%
+ dplyr::mutate(PARCAT2 = ifelse(.data$PARCAT_ind == 1, parcat2[1], parcat2[2])) %>%
+ dplyr::select(-"PARCAT_ind")
+
+ adex <- adex %>% dplyr::mutate(PARCAT1 = dplyr::case_when(
+ (.data$PARAMCD == "TNDOSE" | .data$PARAMCD == "TDOSE") ~ "OVERALL",
+ .data$PARAMCD == "DOSE" | .data$PARAMCD == "NDOSE" ~ "INDIVIDUAL"
+ ))
+
+ adex_visit <- adex %>%
+ dplyr::filter(.data$PARAMCD == "DOSE" | .data$PARAMCD == "NDOSE") %>%
+ dplyr::mutate(
+ AVISIT = rep(visit_schedule(visit_format = visit_format, n_assessments = n_assessments, n_days = n_days), 2)
+ )
+
+ adex <- dplyr::left_join(
+ adex %>%
+ dplyr::group_by(
+ .data$USUBJID,
+ .data$STUDYID,
+ .data$PARAM,
+ .data$PARAMCD,
+ .data$AVALU,
+ .data$PARCAT1,
+ .data$PARCAT2
+ ) %>%
+ dplyr::mutate(id = dplyr::row_number()),
+ adex_visit %>%
+ dplyr::group_by(
+ .data$USUBJID,
+ .data$STUDYID,
+ .data$PARAM,
+ .data$PARAMCD,
+ .data$AVALU,
+ .data$PARCAT1,
+ .data$PARCAT2
+ ) %>%
+ dplyr::mutate(id = dplyr::row_number()),
+ by = c("USUBJID", "STUDYID", "PARCAT1", "PARCAT2", "id", "PARAMCD", "PARAM", "AVALU")
+ ) %>%
+ dplyr::select(-"id")
+
+ adex <- adex %>% dplyr::mutate(AVISITN = dplyr::case_when(
+ AVISIT == "SCREENING" ~ -1,
+ AVISIT == "BASELINE" ~ 0,
+ (grepl("^WEEK", AVISIT) | grepl("^CYCLE", AVISIT)) ~ as.numeric(AVISIT) - 2,
+ TRUE ~ 999000
+ ))
+
+ adex2 <- split(adex, adex$USUBJID) %>%
+ lapply(function(pinfo) {
+ pinfo %>%
+ dplyr::filter(.data$PARAMCD == "DOSE") %>%
+ dplyr::group_by(.data$USUBJID, .data$PARCAT2, .data$AVISIT) %>%
+ dplyr::mutate(changeind = dplyr::case_when(
+ .data$AVISIT == "SCREENING" ~ 0,
+ .data$AVISIT != "SCREENING" ~ sample(c(-1, 0, 1),
+ size = 1,
+ prob = c(0.25, 0.5, 0.25),
+ replace = TRUE
+ )
+ )) %>%
+ dplyr::ungroup() %>%
+ dplyr::group_by(.data$USUBJID, .data$PARCAT2) %>%
+ dplyr::mutate(
+ csum = cumsum(.data$changeind),
+ changeind = dplyr::case_when(
+ .data$csum <= -3 ~ sample(c(0, 1), size = 1, prob = c(0.5, 0.5)),
+ .data$csum >= 3 ~ sample(c(0, -1), size = 1, prob = c(0.5, 0.5)),
+ TRUE ~ .data$changeind
+ )
+ ) %>%
+ dplyr::mutate(csum = cumsum(.data$changeind)) %>%
+ dplyr::ungroup() %>%
+ dplyr::group_by(.data$USUBJID, .data$PARCAT2, .data$AVISIT) %>%
+ dplyr::mutate(AVAL = dplyr::case_when(
+ .data$csum == -2 ~ 480,
+ .data$csum == -1 ~ 720,
+ .data$csum == 0 ~ 960,
+ .data$csum == 1 ~ 1200,
+ .data$csum == 2 ~ 1440
+ )) %>%
+ dplyr::select(-c("csum", "changeind")) %>%
+ dplyr::ungroup()
+ }) %>%
+ Reduce(rbind, .)
+
+ adextmp <- dplyr::full_join(adex2, adex, by = names(adex))
+ adex <- adextmp %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(AVAL = ifelse(.data$PARAMCD == "NDOSE", 1, .data$AVAL)) %>%
+ dplyr::mutate(AVAL = ifelse(
+ .data$PARAMCD == "TNDOSE",
+ sum(.data$AVAL[.data$PARAMCD == "NDOSE"]),
+ .data$AVAL
+ )) %>%
+ dplyr::ungroup() %>%
+ dplyr::group_by(.data$USUBJID, .data$STUDYID, .data$PARCAT2) %>%
+ dplyr::mutate(AVAL = ifelse(
+ .data$PARAMCD == "TDOSE",
+ sum(.data$AVAL[.data$PARAMCD == "DOSE"]),
+ .data$AVAL
+ ))
+
+ adex <- dplyr::inner_join(adsl, adex, by = c("STUDYID", "USUBJID"), multiple = "all") %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::mutate(ASTDTM = sample(
+ seq(lubridate::as_datetime(TRTSDTM), lubridate::as_datetime(TRTENDT), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ASTDTM)
+
+ adex <- adex %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(EXSEQ = seq_len(dplyr::n())) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$ASTDTM,
+ .data$AVISITN
+ ) %>%
+ col_relabel(
+ PARCAT1 = "Parameter Category (Individual/Overall)",
+ PARCAT2 = "Parameter Category (Drug A/Drug B)",
+ EXSEQ = "Analysis Sequence Number"
+ )
+
+ visit_levels <- str_extract(levels(adex$AVISIT), pattern = "[0-9]+")
+ vl_extracted <- vapply(visit_levels, function(x) as.numeric(x[2]), numeric(1))
+ vl_extracted <- c(-1, 1, vl_extracted[!is.na(vl_extracted)])
+
+ tmc_ex_adex <- adex %>%
+ dplyr::mutate(VISITDY = as.numeric(as.character(factor(AVISIT, labels = vl_extracted)))) %>%
+ dplyr::mutate(ASTDTM = lubridate::as_datetime(TRTSDTM) + lubridate::days(VISITDY)) %>%
+ dplyr::distinct(USUBJID, .keep_all = TRUE)
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adex)[is.na(col_labels(tmc_ex_adex))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adex[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adex, file = "data/tmc_ex_adex.rda", compress = "xz")
+}
ADLB
+
+generate_adlb <- function(adsl = tmc_ex_adsl,
+ n_assessments = 3L,
+ n_days = 3L,
+ max_n_lbs = 3L) {
+ set.seed(1)
+ lbcat <- c("CHEMISTRY", "CHEMISTRY", "IMMUNOLOGY")
+ param <- c(
+ "Alanine Aminotransferase Measurement",
+ "C-Reactive Protein Measurement",
+ "Immunoglobulin A Measurement"
+ )
+ paramcd <- c("ALT", "CRP", "IGA")
+ paramu <- c("U/L", "mg/L", "g/L")
+ aval_mean <- c(20, 1, 2)
+ visit_format <- "WEEK"
+
+ # validate and initialize related variables
+ lbcat_init_list <- relvar_init(param, lbcat)
+ param_init_list <- relvar_init(param, paramcd)
+ unit_init_list <- relvar_init(param, paramu)
+
+ adlb <- expand.grid(
+ STUDYID = unique(adsl$STUDYID),
+ USUBJID = adsl$USUBJID,
+ PARAM = as.factor(param_init_list$relvar1),
+ AVISIT = visit_schedule(visit_format = visit_format, n_assessments = n_assessments, n_days = n_days),
+ stringsAsFactors = FALSE
+ )
+
+ # assign AVAL based on different test
+ adlb <- adlb %>%
+ dplyr::mutate(AVAL = stats::rnorm(nrow(adlb), mean = 1, sd = 0.2)) %>%
+ dplyr::left_join(data.frame(PARAM = param, ADJUST = aval_mean), by = "PARAM") %>%
+ dplyr::mutate(AVAL = .data$AVAL * .data$ADJUST) %>%
+ dplyr::select(-"ADJUST")
+
+ # assign related variable values: PARAMxLBCAT are related
+ adlb$LBCAT <- as.factor(rel_var(
+ df = adlb,
+ var_name = "LBCAT",
+ var_values = lbcat_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ # assign related variable values: PARAMxPARAMCD are related
+ adlb$PARAMCD <- as.factor(rel_var(
+ df = adlb,
+ var_name = "PARAMCD",
+ var_values = param_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ adlb$AVALU <- as.factor(rel_var(
+ df = adlb,
+ var_name = "AVALU",
+ var_values = unit_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ adlb <- adlb %>% dplyr::mutate(AVISITN = dplyr::case_when(
+ AVISIT == "SCREENING" ~ -1,
+ AVISIT == "BASELINE" ~ 0,
+ (grepl("^WEEK", AVISIT) | grepl("^CYCLE", AVISIT)) ~ as.numeric(AVISIT) - 2,
+ TRUE ~ NA_real_
+ ))
+
+ adlb <- adlb %>%
+ dplyr::mutate(AVISITN = dplyr::case_when(
+ AVISIT == "SCREENING" ~ -1,
+ AVISIT == "BASELINE" ~ 0,
+ (grepl("^WEEK", AVISIT) | grepl("^CYCLE", AVISIT)) ~ as.numeric(AVISIT) - 2,
+ TRUE ~ NA_real_
+ ))
+
+ # order to prepare for change from screening and baseline values
+ adlb <- adlb[order(adlb$STUDYID, adlb$USUBJID, adlb$PARAMCD, adlb$AVISITN), ]
+
+ adlb <- Reduce(rbind, lapply(split(adlb, adlb$USUBJID), function(x) {
+ x$STUDYID <- adsl$STUDYID[which(adsl$USUBJID == x$USUBJID[1])]
+ x$ABLFL2 <- ifelse(x$AVISIT == "SCREENING", "Y", "")
+ x$ABLFL <- ifelse(toupper(visit_format) == "WEEK" & x$AVISIT == "BASELINE",
+ "Y",
+ ifelse(toupper(visit_format) == "CYCLE" & x$AVISIT == "CYCLE 1 DAY 1", "Y", "")
+ )
+ x
+ }))
+
+ adlb$BASE <- ifelse(adlb$ABLFL2 != "Y", retain(adlb, adlb$AVAL, adlb$ABLFL == "Y"), NA)
+ anrind_choices <- c("HIGH", "LOW", "NORMAL")
+ adlb <- adlb %>%
+ dplyr::mutate(BASETYPE = "LAST") %>%
+ dplyr::mutate(ANRIND = sample_fct(anrind_choices, nrow(adlb), prob = c(0.1, 0.1, 0.8))) %>%
+ dplyr::mutate(ANRLO = dplyr::case_when(
+ .data$PARAMCD == "ALT" ~ 7,
+ .data$PARAMCD == "CRP" ~ 8,
+ .data$PARAMCD == "IGA" ~ 0.8
+ )) %>%
+ dplyr::mutate(ANRHI = dplyr::case_when(
+ .data$PARAMCD == "ALT" ~ 55,
+ .data$PARAMCD == "CRP" ~ 10,
+ .data$PARAMCD == "IGA" ~ 3
+ )) %>%
+ dplyr::mutate(DTYPE = NA) %>%
+ dplyr::mutate(
+ ATOXGR = factor(dplyr::case_when(
+ .data$ANRIND == "LOW" ~ sample(
+ c("-1", "-2", "-3", "-4", "-5"),
+ nrow(adlb),
+ replace = TRUE,
+ prob = c(0.30, 0.25, 0.20, 0.15, 0)
+ ),
+ .data$ANRIND == "HIGH" ~ sample(
+ c("1", "2", "3", "4", "5"),
+ nrow(adlb),
+ replace = TRUE,
+ prob = c(0.30, 0.25, 0.20, 0.15, 0)
+ ),
+ .data$ANRIND == "NORMAL" ~ "0"
+ )) %>% with_label("Analysis Toxicity Grade")
+ ) %>%
+ dplyr::group_by(.data$USUBJID, .data$PARAMCD, .data$BASETYPE) %>%
+ dplyr::mutate(BTOXGR = .data$ATOXGR[.data$ABLFL == "Y"]) %>%
+ dplyr::ungroup() %>%
+ col_relabel(BTOXGR = "Baseline Toxicity Grade")
+
+ # High and low descriptions of the different PARAMCD values
+ # This is currently hard coded as the GDSR does not have these descriptions yet
+ grade_lookup <- tibble::tribble(
+ ~PARAMCD, ~ATOXDSCL, ~ATOXDSCH,
+ "ALB", "Hypoalbuminemia", NA_character_,
+ "ALKPH", NA_character_, "Alkaline phosphatase increased",
+ "ALT", NA_character_, "Alanine aminotransferase increased",
+ "AST", NA_character_, "Aspartate aminotransferase increased",
+ "BILI", NA_character_, "Blood bilirubin increased",
+ "CA", "Hypocalcemia", "Hypercalcemia",
+ "CHOLES", NA_character_, "Cholesterol high",
+ "CK", NA_character_, "CPK increased",
+ "CREAT", NA_character_, "Creatinine increased",
+ "CRP", NA_character_, "C reactive protein increased",
+ "GGT", NA_character_, "GGT increased",
+ "GLUC", "Hypoglycemia", "Hyperglycemia",
+ "HGB", "Anemia", "Hemoglobin increased",
+ "IGA", NA_character_, "Immunoglobulin A increased",
+ "POTAS", "Hypokalemia", "Hyperkalemia",
+ "LYMPH", "CD4 lymphocytes decreased", NA_character_,
+ "PHOS", "Hypophosphatemia", NA_character_,
+ "PLAT", "Platelet count decreased", NA_character_,
+ "SODIUM", "Hyponatremia", "Hypernatremia",
+ "WBC", "White blood cell decreased", "Leukocytosis",
+ )
+
+ # merge grade_lookup onto adlb
+ adlb <- dplyr::left_join(adlb, grade_lookup, by = "PARAMCD")
+
+ # merge adsl to be able to add LB date and study day variables
+ adlb <- dplyr::inner_join(
+ adsl,
+ adlb,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::ungroup() %>%
+ dplyr::group_by(USUBJID) %>%
+ dplyr::arrange(USUBJID, AVISITN) %>%
+ dplyr::mutate(ADTM = rep(
+ sort(sample(
+ seq(lubridate::as_datetime(TRTSDTM[1]), lubridate::as_datetime(TRTENDT[1]), by = "day"),
+ size = nlevels(AVISIT)
+ )),
+ each = n() / nlevels(AVISIT)
+ )) %>%
+ dplyr::ungroup() %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ADTM)
+
+ adlb <- adlb %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(LBSEQ = seq_len(dplyr::n())) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$BASETYPE,
+ .data$AVISITN,
+ .data$DTYPE,
+ .data$ADTM,
+ .data$LBSEQ
+ ) %>%
+ col_relabel(LBSEQ = "Lab Test or Examination Sequence Number")
+
+ adlb <- adlb %>% dplyr::mutate(ONTRTFL = factor(dplyr::case_when(
+ is.na(.data$TRTSDTM) ~ "",
+ is.na(.data$ADTM) ~ "Y",
+ (.data$ADTM < .data$TRTSDTM) ~ "",
+ (.data$ADTM > .data$TRTEDTM) ~ "",
+ TRUE ~ "Y"
+ )))
+
+ flag_variables <- function(data,
+ apply_grouping,
+ apply_filter,
+ apply_mutate) {
+ data_compare <- data %>% dplyr::mutate(row_check = seq_len(nrow(data)))
+ data <- data_compare %>%
+ {
+ if (apply_grouping == TRUE) {
+ dplyr::group_by(., .data$USUBJID, .data$PARAMCD, .data$BASETYPE, .data$AVISIT)
+ } else {
+ dplyr::group_by(., .data$USUBJID, .data$PARAMCD, .data$BASETYPE)
+ }
+ } %>%
+ dplyr::arrange(.data$ADTM, .data$LBSEQ) %>%
+ {
+ if (apply_filter == TRUE) {
+ dplyr::filter(
+ .,
+ (.data$AVISIT != "BASELINE" & .data$AVISIT != "SCREENING") &
+ (.data$ONTRTFL == "Y" | .data$ADTM <= .data$TRTSDTM)
+ ) %>%
+ dplyr::filter(.data$ATOXGR == max(as.numeric(as.character(.data$ATOXGR))))
+ } else if (apply_filter == FALSE) {
+ dplyr::filter(
+ .,
+ (.data$AVISIT != "BASELINE" & .data$AVISIT != "SCREENING") &
+ (.data$ONTRTFL == "Y" | .data$ADTM <= .data$TRTSDTM)
+ ) %>%
+ dplyr::filter(.data$ATOXGR == min(as.numeric(as.character(.data$ATOXGR))))
+ } else {
+ dplyr::filter(
+ .,
+ .data$AVAL == min(.data$AVAL) &
+ (.data$AVISIT != "BASELINE" & .data$AVISIT != "SCREENING") &
+ (.data$ONTRTFL == "Y" | .data$ADTM <= .data$TRTSDTM)
+ )
+ }
+ } %>%
+ dplyr::slice(1) %>%
+ {
+ if (apply_mutate == TRUE) {
+ dplyr::mutate(., new_var = ifelse(is.na(.data$DTYPE), "Y", ""))
+ } else {
+ dplyr::mutate(., new_var = ifelse(is.na(.data$AVAL) == FALSE & is.na(.data$DTYPE), "Y", ""))
+ }
+ } %>%
+ dplyr::ungroup()
+
+ data_compare$new_var <- ifelse(data_compare$row_check %in% data$row_check, "Y", "")
+ data_compare <- data_compare[, -which(names(data_compare) %in% c("row_check"))]
+ return(data_compare)
+ }
+ adlb <- flag_variables(adlb, TRUE, "ELSE", FALSE) %>% dplyr::rename(WORS01FL = "new_var")
+ adlb <- flag_variables(adlb, FALSE, TRUE, TRUE) %>% dplyr::rename(WGRHIFL = "new_var")
+ adlb <- flag_variables(adlb, FALSE, FALSE, TRUE) %>% dplyr::rename(WGRLOFL = "new_var")
+ adlb <- flag_variables(adlb, TRUE, TRUE, TRUE) %>% dplyr::rename(WGRHIVFL = "new_var")
+ adlb <- flag_variables(adlb, TRUE, FALSE, TRUE) %>% dplyr::rename(WGRLOVFL = "new_var")
+
+ tmc_ex_adlb <- adlb %>% dplyr::mutate(
+ ANL01FL = ifelse(
+ (.data$ABLFL == "Y" | (.data$WORS01FL == "Y" & is.na(.data$DTYPE))) &
+ (.data$AVISIT != "SCREENING"),
+ "Y",
+ ""
+ ) %>% with_label("Analysis Flag 01 Baseline Post-Baseline"),
+ PARAM = as.factor(.data$PARAM)
+ )
+
+ tmc_ex_adlb <- tmc_ex_adlb %>%
+ group_by(.data$USUBJID, .data$PARAMCD, .data$BASETYPE) %>%
+ mutate(BNRIND = .data$ANRIND[.data$ABLFL == "Y"]) %>%
+ ungroup() %>%
+ dplyr::mutate(ADY = ceiling(as.numeric(difftime(.data$ADTM, .data$TRTSDTM, units = "days"))))
+
+ tmc_ex_adlb$PARAMCD <- as.factor(tmc_ex_adlb$PARAMCD)
+ tmc_ex_adlb <- tmc_ex_adlb %>%
+ dplyr::mutate(CHG = .data$AVAL - .data$BASE) %>%
+ dplyr::mutate(PCHG = 100 * (.data$CHG / .data$BASE)) %>%
+ col_relabel(
+ LBCAT = "Category for Lab Test",
+ ATOXDSCL = "Analysis Toxicity Description Low",
+ ATOXDSCH = "Analysis Toxicity Description High",
+ WGRHIFL = "Worst High Grade per Patient",
+ WGRLOFL = "Worst Low Grade per Patient",
+ WGRHIVFL = "Worst High Grade per Patient per Visit",
+ WGRLOVFL = "Worst Low Grade per Patient per Visit"
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adlb)[is.na(col_labels(tmc_ex_adlb))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adlb[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adlb, file = "data/tmc_ex_adlb.rda", compress = "xz")
+}
ADMH
+
+generate_admh <- function(adsl = tmc_ex_adsl,
+ max_n_mhs = 10L) {
+ set.seed(1)
+ lookup_mh <- tibble::tribble(
+ ~MHBODSYS, ~MHDECOD, ~MHSOC,
+ "cl A", "trm A_1/2", "cl A",
+ "cl A", "trm A_2/2", "cl A",
+ "cl B", "trm B_1/3", "cl B",
+ "cl B", "trm B_2/3", "cl B",
+ "cl B", "trm B_3/3", "cl B",
+ "cl C", "trm C_1/2", "cl C",
+ "cl C", "trm C_2/2", "cl C",
+ "cl D", "trm D_1/3", "cl D",
+ "cl D", "trm D_2/3", "cl D",
+ "cl D", "trm D_3/3", "cl D"
+ )
+
+ admh <- Map(
+ function(id, sid) {
+ n_mhs <- sample(0:max_n_mhs, 1)
+ i <- sample(seq_len(nrow(lookup_mh)), n_mhs, TRUE)
+ dplyr::mutate(
+ lookup_mh[i, ],
+ USUBJID = id,
+ STUDYID = sid
+ )
+ },
+ adsl$USUBJID,
+ adsl$STUDYID
+ ) %>%
+ Reduce(rbind, .) %>%
+ `[`(c(4, 5, 1, 2, 3)) %>%
+ dplyr::mutate(MHTERM = .data$MHDECOD %>% with_label("Reported Term for the Medical History"))
+
+ admh <- dplyr::inner_join(
+ adsl,
+ admh,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::mutate(ASTDTM = sample(
+ seq(lubridate::as_datetime(TRTSDTM), lubridate::as_datetime(TRTENDT), by = "day"),
+ size = 1
+ )) %>%
+ select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ASTDTM, .data$MHTERM) %>%
+ dplyr::mutate(MHDISTAT = sample(
+ x = c("Resolved", "Ongoing with treatment", "Ongoing without treatment"),
+ prob = c(0.6, 0.2, 0.2),
+ size = dplyr::n(),
+ replace = TRUE
+ ) %>% with_label("Status of Disease"))
+
+ tmc_ex_admh <- admh %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(MHSEQ = seq_len(dplyr::n())) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ASTDTM) %>%
+ col_relabel(
+ MHBODSYS = "Body System or Organ Class",
+ MHDECOD = "Dictionary-Derived Term",
+ MHSOC = "Primary System Organ Class",
+ MHSEQ = "Sponsor-Defined Identifier"
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_admh)[is.na(col_labels(tmc_ex_admh))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_admh[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_admh, file = "data/tmc_ex_admh.rda", compress = "xz")
+}
ADQS
+
+generate_adqs <- function(adsl = tmc_ex_adsl,
+ n_assessments = 5L,
+ n_days = 5L) {
+ set.seed(1)
+ param <- c(
+ "BFI All Questions",
+ "Fatigue Interference",
+ "Function/Well-Being (GF1,GF3,GF7)",
+ "Treatment Side Effects (GP2,C5,GP5)",
+ "FKSI-19 All Questions"
+ )
+ paramcd <- c("BFIALL", "FATIGI", "FKSI-FWB", "FKSI-TSE", "FKSIALL")
+ visit_format <- "WEEK"
+
+ param_init_list <- relvar_init(param, paramcd)
+
+ adqs <- expand.grid(
+ STUDYID = unique(adsl$STUDYID),
+ USUBJID = adsl$USUBJID,
+ PARAM = param_init_list$relvar1,
+ AVISIT = visit_schedule(visit_format = visit_format, n_assessments = n_assessments, n_days = n_days),
+ stringsAsFactors = FALSE
+ )
+
+ adqs <- dplyr::mutate(
+ adqs,
+ AVISITN = dplyr::case_when(
+ AVISIT == "SCREENING" ~ -1,
+ AVISIT == "BASELINE" ~ 0,
+ (grepl("^WEEK", AVISIT) | grepl("^CYCLE", AVISIT)) ~ as.numeric(AVISIT) - 2,
+ TRUE ~ NA_real_
+ )
+ )
+
+ adqs$PARAMCD <- rel_var(df = adqs, var_name = "PARAMCD", var_values = param_init_list$relvar2, related_var = "PARAM")
+ adqs$AVAL <- stats::rnorm(nrow(adqs), mean = 50, sd = 8) + adqs$AVISITN * stats::rnorm(nrow(adqs), mean = 5, sd = 2)
+ adqs <- adqs[order(adqs$STUDYID, adqs$USUBJID, adqs$PARAMCD, adqs$AVISITN), ]
+
+ adqs <- Reduce(
+ rbind,
+ lapply(
+ split(adqs, adqs$USUBJID),
+ function(x) {
+ x$STUDYID <- adsl$STUDYID[which(adsl$USUBJID == x$USUBJID[1])]
+ x$ABLFL2 <- ifelse(x$AVISIT == "SCREENING", "Y", "")
+ x$ABLFL <- ifelse(
+ toupper(visit_format) == "WEEK" & x$AVISIT == "BASELINE",
+ "Y",
+ ifelse(
+ toupper(visit_format) == "CYCLE" & x$AVISIT == "CYCLE 1 DAY 1",
+ "Y",
+ ""
+ )
+ )
+ x
+ }
+ )
+ )
+
+ adqs$BASE <- ifelse(adqs$ABLFL2 != "Y", retain(adqs, adqs$AVAL, adqs$ABLFL == "Y"), NA)
+ adqs <- adqs %>% dplyr::mutate(CHG = .data$AVAL - .data$BASE)
+
+ adqs <- dplyr::inner_join(
+ adsl,
+ adqs,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ ungroup() %>%
+ group_by(USUBJID) %>%
+ arrange(USUBJID, AVISITN) %>%
+ dplyr::mutate(ADTM = rep(
+ sort(sample(
+ seq(lubridate::as_datetime(TRTSDTM[1]), lubridate::as_datetime(TRTENDT[1]), by = "day"),
+ size = nlevels(AVISIT)
+ )),
+ each = n() / nlevels(AVISIT)
+ )) %>%
+ dplyr::ungroup() %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ADTM)
+
+ tmc_ex_adqs <- adqs %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$AVISITN,
+ .data$ADTM
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adqs)[is.na(col_labels(tmc_ex_adqs))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adqs[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adqs, file = "data/tmc_ex_adqs.rda", compress = "xz")
+}
ADRS
+
+generate_adrs <- function(adsl = tmc_ex_adsl) {
+ set.seed(1)
+ param_codes <- stats::setNames(1:5, c("CR", "PR", "SD", "PD", "NE"))
+
+ lookup_ars <- expand.grid(
+ ARM = c("A: Drug X", "B: Placebo", "C: Combination"),
+ AVALC = names(param_codes)
+ ) %>% dplyr::mutate(
+ AVAL = param_codes[.data$AVALC],
+ p_scr = c(rep(0, 3), rep(0, 3), c(1, 1, 1), c(0, 0, 0), c(0, 0, 0)),
+ p_bsl = c(rep(0, 3), rep(0, 3), c(1, 1, 1), c(0, 0, 0), c(0, 0, 0)),
+ p_cycle = c(c(.35, .25, .4), c(.30, .20, .20), c(.2, .25, .3), c(.14, 0.20, 0.18), c(.01, 0.1, 0.02)),
+ p_eoi = c(c(.35, .25, .4), c(.30, .20, .20), c(.2, .25, .3), c(.14, 0.20, 0.18), c(.01, 0.1, 0.02)),
+ p_fu = c(c(.25, .15, .3), c(.15, .05, .25), c(.3, .25, .3), c(.3, .55, .25), rep(0, 3))
+ )
+
+ adrs <- split(adsl, adsl$USUBJID) %>%
+ lapply(function(pinfo) {
+ probs <- dplyr::filter(lookup_ars, .data$ARM == as.character(pinfo$ACTARM))
+ # screening
+ rsp_screen <- sample(probs$AVALC, 1, prob = probs$p_scr) %>% as.character()
+ # baseline
+ rsp_bsl <- sample(probs$AVALC, 1, prob = probs$p_bsl) %>% as.character()
+ # cycle
+ rsp_c2d1 <- sample(probs$AVALC, 1, prob = probs$p_cycle) %>% as.character()
+ rsp_c4d1 <- sample(probs$AVALC, 1, prob = probs$p_cycle) %>% as.character()
+ # end of induction
+ rsp_eoi <- sample(probs$AVALC, 1, prob = probs$p_eoi) %>% as.character()
+ # follow up
+ rsp_fu <- sample(probs$AVALC, 1, prob = probs$p_fu) %>% as.character()
+
+ best_rsp <- min(param_codes[c(rsp_screen, rsp_bsl, rsp_eoi, rsp_fu, rsp_c2d1, rsp_c4d1)])
+ best_rsp_i <- which.min(param_codes[c(rsp_screen, rsp_bsl, rsp_eoi, rsp_fu, rsp_c2d1, rsp_c4d1)])
+
+ avisit <- c("SCREENING", "BASELINE", "CYCLE 2 DAY 1", "CYCLE 4 DAY 1", "END OF INDUCTION", "FOLLOW UP")
+
+ # meaningful date information
+ TRTSTDT <- lubridate::date(pinfo$TRTSDTM)
+ TRTENDT <- lubridate::date(dplyr::if_else(
+ !is.na(pinfo$TRTEDTM), pinfo$TRTEDTM,
+ lubridate::floor_date(TRTSTDT + study_duration_secs, unit = "day")
+ ))
+ scr_date <- TRTSTDT - lubridate::days(100)
+ bs_date <- TRTSTDT
+ flu_date <- sample(seq(lubridate::as_datetime(TRTSTDT), lubridate::as_datetime(TRTENDT), by = "day"), size = 1)
+ eoi_date <- sample(seq(lubridate::as_datetime(TRTSTDT), lubridate::as_datetime(TRTENDT), by = "day"), size = 1)
+ c2d1_date <- sample(seq(lubridate::as_datetime(TRTSTDT), lubridate::as_datetime(TRTENDT), by = "day"), size = 1)
+ c4d1_date <- min(lubridate::date(c2d1_date + lubridate::days(60)), TRTENDT)
+
+ tibble::tibble(
+ STUDYID = pinfo$STUDYID,
+ USUBJID = pinfo$USUBJID,
+ PARAMCD = as.factor(c(rep("OVRINV", 6), "BESRSPI", "INVET")),
+ PARAM = as.factor(dplyr::recode(
+ .data$PARAMCD,
+ OVRINV = "Overall Response by Investigator - by visit",
+ OVRSPI = "Best Overall Response by Investigator (no confirmation required)",
+ BESRSPI = "Best Confirmed Overall Response by Investigator",
+ INVET = "Investigator End Of Induction Response"
+ )),
+ AVALC = c(
+ rsp_screen, rsp_bsl, rsp_c2d1, rsp_c4d1, rsp_eoi, rsp_fu,
+ names(param_codes)[best_rsp],
+ rsp_eoi
+ ),
+ AVAL = param_codes[.data$AVALC],
+ AVISIT = factor(c(avisit, avisit[best_rsp_i], avisit[5]), levels = avisit)
+ ) %>%
+ merge(
+ tibble::tibble(
+ AVISIT = avisit,
+ ADTM = c(scr_date, bs_date, c2d1_date, c4d1_date, eoi_date, flu_date),
+ AVISITN = c(-1, 0, 2, 4, 999, 999),
+ TRTSDTM = pinfo$TRTSDTM
+ ) %>%
+ dplyr::select(-"TRTSDTM"),
+ by = "AVISIT"
+ )
+ }) %>%
+ Reduce(rbind, .) %>%
+ dplyr::mutate(
+ AVALC = factor(.data$AVALC, levels = names(param_codes)),
+ DTHFL = factor(sample(c("Y", "N"), nrow(.), replace = TRUE, prob = c(1, 0.8))) %>%
+ with_label("Death Flag")
+ )
+
+ # merge ADSL to be able to add RS date and study day variables
+ adrs <- dplyr::inner_join(
+ adsl,
+ adrs,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ )
+
+ tmc_ex_adrs <- adrs %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$AVISITN,
+ .data$ADTM
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adrs)[is.na(col_labels(tmc_ex_adrs))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adrs[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adrs, file = "data/tmc_ex_adrs.rda", compress = "xz")
+}
ADTTE
+
+generate_adtte <- function(adsl = tmc_ex_adsl) {
+ set.seed(1)
+ lookup_tte <- tibble::tribble(
+ ~ARM, ~PARAMCD, ~PARAM, ~LAMBDA, ~CNSR_P,
+ "ARM A", "OS", "Overall Survival", log(2) / 610, 0.4,
+ "ARM B", "OS", "Overall Survival", log(2) / 490, 0.3,
+ "ARM C", "OS", "Overall Survival", log(2) / 365, 0.2,
+ "ARM A", "PFS", "Progression Free Survival", log(2) / 365, 0.4,
+ "ARM B", "PFS", "Progression Free Survival", log(2) / 305, 0.3,
+ "ARM C", "PFS", "Progression Free Survival", log(2) / 243, 0.2,
+ "ARM A", "EFS", "Event Free Survival", log(2) / 365, 0.4,
+ "ARM B", "EFS", "Event Free Survival", log(2) / 305, 0.3,
+ "ARM C", "EFS", "Event Free Survival", log(2) / 243, 0.2,
+ "ARM A", "CRSD", "Duration of Confirmed Response", log(2) / 305, 0.4,
+ "ARM B", "CRSD", "Duration of Confirmed Response", log(2) / 243, 0.3,
+ "ARM C", "CRSD", "Duration of Confirmed Response", log(2) / 182, 0.2
+ )
+
+ evntdescr_sel <- c(
+ "Death",
+ "Disease Progression",
+ "Last Tumor Assessment",
+ "Adverse Event",
+ "Last Date Known To Be Alive"
+ )
+
+ cnsdtdscr_sel <- c(
+ "Preferred Term",
+ "Clinical Cut Off",
+ "Completion or Discontinuation",
+ "End of AE Reporting Period"
+ )
+
+ adtte <- split(adsl, adsl$USUBJID) %>%
+ lapply(FUN = function(pinfo) {
+ lookup_tte %>%
+ dplyr::filter(.data$ARM == as.character(pinfo$ACTARMCD)) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(
+ STUDYID = pinfo$STUDYID,
+ USUBJID = pinfo$USUBJID,
+ CNSR = sample(c(0, 1), 1, prob = c(1 - .data$CNSR_P, .data$CNSR_P)),
+ AVAL = stats::rexp(1, .data$LAMBDA),
+ AVALU = "DAYS",
+ EVNTDESC = if (.data$CNSR == 1) {
+ sample(evntdescr_sel[-c(1:2)], 1)
+ } else {
+ ifelse(.data$PARAMCD == "OS",
+ sample(evntdescr_sel[1], 1),
+ sample(evntdescr_sel[c(1:2)], 1)
+ )
+ }
+ ) %>%
+ dplyr::select(-"LAMBDA", -"CNSR_P")
+ }) %>%
+ Reduce(rbind, .)
+
+ # merge ADSL to be able to add TTE date and study day variables
+ adtte <- dplyr::inner_join(
+ adsl,
+ dplyr::select(adtte, -"ARM"),
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::mutate(ADTM = sample(
+ seq(lubridate::as_datetime(TRTSDTM), lubridate::as_datetime(TRTENDT), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ADTM)
+
+ adtte <- adtte %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::mutate(PARAM = as.factor(.data$PARAM)) %>%
+ dplyr::mutate(PARAMCD = as.factor(.data$PARAMCD)) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$ADTM
+ )
+ lbls <- col_labels(adtte)
+
+ # adding adverse event counts and log follow-up time
+ tmc_ex_adtte <- dplyr::bind_rows(
+ adtte,
+ data.frame(adtte %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::slice_head(n = 1) %>%
+ dplyr::mutate(
+ PARAMCD = "TNE",
+ PARAM = "Total Number of Exacerbations",
+ AVAL = stats::rpois(1, 3),
+ AVALU = "COUNT",
+ lgTMATRSK = log(stats::rexp(1, rate = 3)),
+ dplyr::across(c("ADTM", "EVNTDESC"), ~NA)
+ ))
+ ) %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$ADTM
+ )
+ col_labels(tmc_ex_adtte) <- c(lbls, lgTMATRSK = "Log Time At Risk")
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_adtte)[is.na(col_labels(tmc_ex_adtte))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_adtte[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_adtte, file = "data/tmc_ex_adtte.rda", compress = "xz")
+}
ADVS
+
+generate_advs <- function(adsl = tmc_ex_adsl,
+ n_assessments = 5L,
+ n_days = 5L) {
+ set.seed(1)
+ param <- c(
+ "Diastolic Blood Pressure",
+ "Pulse Rate",
+ "Respiratory Rate",
+ "Systolic Blood Pressure",
+ "Temperature", "Weight"
+ )
+ paramcd <- c("DIABP", "PULSE", "RESP", "SYSBP", "TEMP", "WEIGHT")
+ paramu <- c("Pa", "beats/min", "breaths/min", "Pa", "C", "Kg")
+ visit_format <- "WEEK"
+
+ param_init_list <- relvar_init(param, paramcd)
+ unit_init_list <- relvar_init(param, paramu)
+
+ advs <- expand.grid(
+ STUDYID = unique(adsl$STUDYID),
+ USUBJID = adsl$USUBJID,
+ PARAM = as.factor(param_init_list$relvar1),
+ AVISIT = visit_schedule(visit_format = visit_format, n_assessments = n_assessments),
+ stringsAsFactors = FALSE
+ )
+
+ advs <- dplyr::mutate(
+ advs,
+ AVISITN = dplyr::case_when(
+ AVISIT == "SCREENING" ~ -1,
+ AVISIT == "BASELINE" ~ 0,
+ (grepl("^WEEK", AVISIT) | grepl("^CYCLE", AVISIT)) ~ as.numeric(AVISIT) - 2,
+ TRUE ~ NA_real_
+ )
+ )
+
+ advs$PARAMCD <- as.factor(rel_var(
+ df = advs,
+ var_name = "PARAMCD",
+ var_values = param_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+ advs$AVALU <- as.factor(rel_var(
+ df = advs,
+ var_name = "AVALU",
+ var_values = unit_init_list$relvar2,
+ related_var = "PARAM"
+ ))
+
+ advs$AVAL <- stats::rnorm(nrow(advs), mean = 50, sd = 8)
+ advs <- advs[order(advs$STUDYID, advs$USUBJID, advs$PARAMCD, advs$AVISITN), ]
+
+ advs <- dplyr::inner_join(
+ adsl,
+ advs,
+ by = c("STUDYID", "USUBJID"),
+ multiple = "all"
+ ) %>%
+ dplyr::rowwise() %>%
+ dplyr::mutate(TRTENDT = lubridate::date(dplyr::case_when(
+ is.na(TRTEDTM) ~ lubridate::floor_date(lubridate::date(TRTSDTM) + study_duration_secs, unit = "day"),
+ TRUE ~ TRTEDTM
+ ))) %>%
+ dplyr::mutate(ADTM = sample(
+ seq(lubridate::as_datetime(TRTSDTM), lubridate::as_datetime(TRTENDT), by = "day"),
+ size = 1
+ )) %>%
+ dplyr::mutate(ADY = ceiling(difftime(ADTM, TRTSDTM, units = "days"))) %>%
+ dplyr::select(-TRTENDT) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(.data$STUDYID, .data$USUBJID, .data$ADTM)
+
+ tmc_ex_advs <- advs %>%
+ dplyr::group_by(.data$USUBJID) %>%
+ dplyr::ungroup() %>%
+ dplyr::arrange(
+ .data$STUDYID,
+ .data$USUBJID,
+ .data$PARAMCD,
+ .data$AVISITN,
+ .data$ADTM
+ )
+
+ i_lbls <- sapply(
+ names(col_labels(tmc_ex_advs)[is.na(col_labels(tmc_ex_advs))]), function(x) which(names(common_var_labels) == x)
+ )
+ col_labels(tmc_ex_advs[names(i_lbls)]) <- common_var_labels[i_lbls]
+
+ save(tmc_ex_advs, file = "data/tmc_ex_advs.rda", compress = "xz")
+}
+# Generate & load adsl
+tmp_fol <- getwd()
+setwd(dirname(tmp_fol))
+generate_adsl()
+load("data/tmc_ex_adsl.rda")
+
+# Generate other datasets
+generate_adae()
+generate_adaette()
+generate_adcm()
+generate_adeg()
+generate_adex()
+generate_adlb()
+generate_admh()
+generate_adqs()
+generate_adrs()
+generate_adtte()
+generate_advs()
+
+setwd(tmp_fol)
vignettes/quickstart_substitute.Rmd
+ quickstart_substitute.Rmd
Considering an expression, R usually evaluates it and returns its
+value. Instead of focusing on the value, it is also possible to work
+with the code which generated the
+value. This is where non standard evaluation, or NSE,
+starts. The function substitute
is an important element of
+non-standard evaluation. For instance, if we consider a
+defined as a <- 5
, then the expression a
+returns 5, and the substitute(a)
returns the code to obtain
+the value: a
.
This is the principle teal
relies on to:
Show R Code
.The expression returning the displayed value must be reactive. The
+information in the encoding on one hand, and the filtering panel on the
+other hand modify the expression and the displayed value. As such,
+teal
needs to work both on expressions and values and
+relies heavily on NSE.
The NSE is an advanced notion and mixing it with Shiny app +development is a source of difficulties such as:
+As an alternative, it is possible to focus first on the NSE aspects
+in plain R, and only once ready, integrate it in the Shiny App. The
+following are a few practical examples demonstrating how NSE works. The
+choice was made to focus on substitute
.
+non_evaluated_expression <- substitute(expr = a + b)
+non_evaluated_expression
+## a + b
+eval(non_evaluated_expression)
+## Error in eval(non_evaluated_expression): object 'b' not found
What happened?
+substitute
returns the code and not the value,a
and b
exist, the
+expression can run without error:
+non_evaluated_expression <- substitute(expr = a + b)
+a <- 1
+b <- 5
+eval(non_evaluated_expression)
+## [1] 6
Now, the function name substitute
is for a reason. Not
+only returning the expression, it also operates
+substitutions of some terms within a given expression.
+fun <- function(a, b) {
+ substitute(expr = a + b)
+}
+non_evaluated_expression <- fun(5, -2)
+non_evaluated_expression
+## 5 + -2
+eval(non_evaluated_expression)
+## [1] 3
What happened?
+a
and b
exist in the function
+environment where substitute
is called.substitute
were
+replaced by the values of a
and b
.Indeed, before returning the expression, substitute
+verifies if a
and b
don’t have any value
+existing in the evaluation environment. If so, values of a
+and b
are used in the expression.
It is also possible to use the second argument of
+substitute
, env
, an environment (or a list)
+containing objects. If the expression submitted in
+substitute
has corresponding objects in env
,
+the terms within the expression will be substituted with provided
+values:
+non_evaluated_expression <- substitute(
+ expr = a + b,
+ env = list(a = 5, b = 5)
+)
+non_evaluated_expression
+## 5 + 5
+eval(non_evaluated_expression)
+## [1] 10
What happened?
+a
and
+b
were taken from was directly declared within the
+substitute
expression (argument expr
) and the
+values were substituted (argument env
).substitute
returned a non-evaluated expression, use
+eval()
to evaluate it.With a slightly more elaborate expression:
+
+non_evaluated_expression <- substitute(
+ expr = plot(x = x, y = exp(x), main = text),
+ env = list(x = 0:10, text = "A graph")
+)
+non_evaluated_expression
+## plot(x = 0:10, y = exp(0:10), main = "A graph")
+eval(non_evaluated_expression)
Note that:
+x
as an argument name in plot has been preserved, while
+x
as an object has been replaced.In formulas, character strings are not accepted, how do we execute +the substitution?
+
+# Error expected:
+plot_expr <- substitute(
+ expr = plot(y ~ x, data = iris, main = text),
+ env = list(
+ x = Sepal.Length,
+ y = Sepal.Width,
+ text = "Iris, again ..."
+ )
+)
+## Error: object 'Sepal.Length' not found
+# Error expected:
+plot_expr <- substitute(
+ expr = plot(y ~ x, data = iris, main = text),
+ env = list(
+ x = "Sepal.Length",
+ y = "Sepal.Width",
+ text = "Iris, again ..."
+ )
+)
+plot_expr
+## plot("Sepal.Width" ~ "Sepal.Length", data = iris, main = "Iris, again ...")
+eval(plot_expr)
+## Error in terms.formula(formula, data = data): invalid term in model formula
The object names have a specific class (name
);
+as.names
coerces a character string to an object name
+(alternatively, as.symbol
provides an identical
+result):
Lets imagine a pipe-flavored expression, with df
being
+the term corresponding to the dataframe which should be substituted:
+df %>% plot(y ~ x, data = ., main = text)
.
The principle exposed above can work directly without addition.
+However, df
in the expression is then replaced directly by
+the value of the object provided and not the expression generating the
+dataframe: the pipeline is working but not humanly readable.
+library(dplyr)
+##
+## Attaching package: 'dplyr'
+## The following objects are masked from 'package:stats':
+##
+## filter, lag
+## The following objects are masked from 'package:base':
+##
+## intersect, setdiff, setequal, union
+
+short_iris <- head(iris)
+plot_expr <- substitute(
+ expr = df %>% plot(y ~ x, data = ., main = text),
+ env = list(
+ df = short_iris,
+ x = as.name("Sepal.Length"),
+ y = as.symbol("Sepal.Width"),
+ text = "Iris, again ..."
+ )
+)
+eval(plot_expr)
+plot_expr
+## list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4), Sepal.Width = c(3.5,
+## 3, 3.2, 3.1, 3.6, 3.9), Petal.Length = c(1.4, 1.4, 1.3, 1.5,
+## 1.4, 1.7), Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.4), Species = c(1L,
+## 1L, 1L, 1L, 1L, 1L)) %>% plot(Sepal.Width ~ Sepal.Length, data = .,
+## main = "Iris, again ...")
How can we replace the value by the expression generating this +value?
+That is pretty much the topic of the vignette:
+substitute
.
+plot_expr <- substitute(
+ expr = df %>% plot(y ~ x, data = ., main = text),
+ env = list(
+ df = substitute(iris),
+ x = as.name("Sepal.Length"),
+ y = as.symbol("Sepal.Width"),
+ text = "Iris, again ..."
+ )
+)
+plot_expr
+## iris %>% plot(Sepal.Width ~ Sepal.Length, data = ., main = "Iris, again ...")
+eval(plot_expr)
substitute
is relevant when the expression needs to be
+modified. It takes 2 arguments:
+expr
the expression to be (eventually)
+substituted.env
the environment in which potential replacement
+value might be needed.y ~ x
) then, use as.name
or
+as.symbol
.iris
) then,
+use substitute
.rtables
+substitute
+The substitute
approach can be used with the
+rtables
pipelines.
Lets prepare an example for reporting data from the LB domain. The
+example is based on the template LBT01
; the target is to
+report in columns the lab test result per study arm, as values
+(AVAL
) and changes from baseline (CHG
), per
+analysis visit in rows.
The data can be prepared as follows:
+
+library(teal.modules.clinical)
+library(dplyr)
+
+adlb <- tmc_ex_adlb
+adlb_f <- adlb %>%
+ filter(
+ PARAM == "Alanine Aminotransferase Measurement" &
+ ARMCD %in% c("ARM A", "ARM B") & AVISIT == "WEEK 1 DAY 8"
+ )
And the rtables
expression is obtained as:
+rtables_expr <- substitute(
+ expr = basic_table() %>%
+ split_cols_by(arm, split_fun = drop_split_levels) %>%
+ split_rows_by(visit, split_fun = drop_split_levels) %>%
+ split_cols_by_multivar(
+ vars = c("AVAL", "CHG"),
+ varlabels = c("Value", "Change")
+ ) %>%
+ summarize_colvars() %>%
+ build_table(df = df),
+ env = list(
+ df = substitute(adlb_f),
+ arm = "ARM",
+ visit = "AVISIT"
+ )
+)
The expression is valid … :
+
+eval(rtables_expr)
+## A: Drug X B: Placebo
+## Value Change Value Change
+## ——————————————————————————————————————————————————————————————————————
+## WEEK 1 DAY 8
+## n 69 69 73 73
+## Mean (SD) 20.8 (4.1) 1.6 (6.1) 20.2 (4.1) -0.2 (5.6)
+## Median 20.4 2.4 20.0 -0.2
+## Min - Max 12.8 - 34.6 -11.3 - 14.2 12.6 - 29.0 -12.8 - 10.8
… but not easily readable …:
+
+rtables_expr
+## basic_table() %>% split_cols_by("ARM", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c("Value",
+## "Change")) %>% summarize_colvars() %>% build_table(df = adlb_f)
… but that can be arranged:
+
+library(teal)
+library(styler)
+
+#' Stylish code
+#'
+#' Deparse an expression and display the code following NEST conventions.
+#'
+#' @param expr (`call`)\cr or possibly understood as so.
+#'
+styled_expr <- function(expr) {
+ print(
+ styler::style_text(text = deparse(expr)),
+ colored = FALSE
+ )
+}
+#'
+#' @examples
+styled_expr(rtables_expr)
+## basic_table() %>%
+## split_cols_by("ARM", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c(
+## "Value",
+## "Change"
+## )) %>%
+## summarize_colvars() %>%
+## build_table(df = adlb_f)
substitute
in a function
+Moving further, substitute
can actually be wrapped in a
+function, this way the rtables
pipelines are
+programmatically obtained:
+rtables_expr <- function(df,
+ arm,
+ visit) {
+ substitute(
+ expr = basic_table() %>%
+ split_cols_by(arm, split_fun = drop_split_levels) %>%
+ split_rows_by(visit, split_fun = drop_split_levels) %>%
+ split_cols_by_multivar(
+ vars = c("AVAL", "CHG"),
+ varlabels = c("Value", "Change")
+ ) %>%
+ summarize_colvars() %>%
+ build_table(df = df),
+ env = list(
+ df = substitute(df),
+ arm = arm,
+ visit = visit
+ )
+ )
+}
+result <- rtables_expr(df = adlb_f, arm = "ARM", visit = "AVISIT")
+styled_expr(result)
+## basic_table() %>%
+## split_cols_by("ARM", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c(
+## "Value",
+## "Change"
+## )) %>%
+## summarize_colvars() %>%
+## build_table(df = adlb_f)
+eval(result)
+## A: Drug X B: Placebo
+## Value Change Value Change
+## ——————————————————————————————————————————————————————————————————————
+## WEEK 1 DAY 8
+## n 69 69 73 73
+## Mean (SD) 20.8 (4.1) 1.6 (6.1) 20.2 (4.1) -0.2 (5.6)
+## Median 20.4 2.4 20.0 -0.2
+## Min - Max 12.8 - 34.6 -11.3 - 14.2 12.6 - 29.0 -12.8 - 10.8
teal
module
+encoding panel.
+result <- rtables_expr(df = adlb_f, arm = "ARMCD", visit = "AVISITN")
+eval(result)
+## Split var [AVISITN] was not character or factor. Converting to factor
+## ARM A ARM B
+## Value Change Value Change
+## —————————————————————————————————————————————————————————————————————
+## 1
+## n 69 69 73 73
+## Mean (SD) 20.8 (4.1) 1.6 (6.1) 20.2 (4.1) -0.2 (5.6)
+## Median 20.4 2.4 20.0 -0.2
+## Min - Max 12.8 - 34.6 -11.3 - 14.2 12.6 - 29.0 -12.8 - 10.8
+styled_expr(result)
+## basic_table() %>%
+## split_cols_by("ARMCD", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISITN", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c(
+## "Value",
+## "Change"
+## )) %>%
+## summarize_colvars() %>%
+## build_table(df = adlb_f)
It is also possible to manipulate expressions, for instance, +expressions might be chained in a pipeline.
+
+#' Expressions as a pipeline
+#'
+#' Accepts expressions to be chained using the `magrittr` pipeline-flavor.
+#' @param ... (`call`)\cr or object which can be interpreted as so.
+#' (e.g. `name`)
+#'
+pipe_expr <- function(...) {
+ exprs <- unlist(list(...))
+ exprs <- lapply(
+ exprs,
+ function(x) {
+ x <- deparse(x)
+ paste(x, collapse = " ")
+ }
+ )
+ exprs <- unlist(exprs)
+ exprs <- paste(exprs, collapse = " %>% ")
+ str2lang(exprs)
+}
+
+#' @examples
+result <- pipe_expr(
+ expr1 = substitute(df),
+ expr2 = substitute(head)
+)
+result
+## df %>% head
rtables
, layers enclosing
+analyze
call handle .stats
option. The lean
+expression should include the .stats
option, only
+when the default value is changed.teal
module when
+rendering the code with Show R Code
:
+rtables_expr <- function(df,
+ arm,
+ visit,
+ .stats = NULL) {
+ # The rtables layout is decomposed into a list of expressions.
+ lyt <- list()
+ # 1. First the columns and rows:
+ lyt$structure <- substitute(
+ expr = basic_table() %>%
+ split_cols_by(arm, split_fun = drop_split_levels) %>%
+ split_rows_by(visit, split_fun = drop_split_levels) %>%
+ split_cols_by_multivar(
+ vars = c("AVAL", "CHG"),
+ varlabels = c("Value", "Change")
+ ),
+ env = list(
+ arm = arm,
+ visit = visit
+ )
+ )
+ # 2. The analyze layer which depends on the use of .stats.
+ lyt$analyze <- if (is.null(.stats)) {
+ substitute(
+ summarize_colvars()
+ )
+ } else {
+ substitute(
+ summarize_colvars(.stats = .stats),
+ list(.stats = .stats)
+ )
+ }
+ # 3. And finishing with rtables::build_table.
+ lyt$build <- substitute(
+ build_table(df = df),
+ list(df = substitute(df))
+ )
+ # As previously demonstrated, expressions can be manipulated and
+ # chained in a pipeline.
+ pipe_expr(lyt)
+}
+result <- rtables_expr(df = adlb_f, arm = "ARM", visit = "AVISIT")
+styled_expr(result)
+## basic_table() %>%
+## split_cols_by("ARM", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c(
+## "Value",
+## "Change"
+## )) %>%
+## summarize_colvars() %>%
+## build_table(df = adlb_f)
+eval(result)
+## A: Drug X B: Placebo
+## Value Change Value Change
+## ——————————————————————————————————————————————————————————————————————
+## WEEK 1 DAY 8
+## n 69 69 73 73
+## Mean (SD) 20.8 (4.1) 1.6 (6.1) 20.2 (4.1) -0.2 (5.6)
+## Median 20.4 2.4 20.0 -0.2
+## Min - Max 12.8 - 34.6 -11.3 - 14.2 12.6 - 29.0 -12.8 - 10.8
+result <- rtables_expr(
+ df = adlb_f, arm = "ARM", visit = "AVISIT",
+ .stats = c("n", "mean_sd")
+)
+styled_expr(result)
+## basic_table() %>%
+## split_cols_by("ARM", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c(
+## "Value",
+## "Change"
+## )) %>%
+## summarize_colvars(.stats = c("n", "mean_sd")) %>%
+## build_table(df = adlb_f)
+eval(result)
+## A: Drug X B: Placebo
+## Value Change Value Change
+## ———————————————————————————————————————————————————————————————
+## WEEK 1 DAY 8
+## n 69 69 73 73
+## Mean (SD) 20.8 (4.1) 1.6 (6.1) 20.2 (4.1) -0.2 (5.6)
Finally, it would also be possible to wrap several expressions into a +single function.
+
+rtables_expr <- function(df,
+ paramcd,
+ arm,
+ visit,
+ .stats = NULL) {
+ # y is a list which will collect two expressions:
+ # 1. y$data with the preprocessing steps.
+ # 2. y$rtables the table layout and build.
+ y <- list()
+ # 1. Preprocessing ---
+ y$data <- substitute(
+ df <- df %>%
+ filter(
+ PARAMCD == paramcd &
+ ARMCD %in% c("ARM A", "ARM B") & AVISIT == "WEEK 1 DAY 8"
+ ),
+ list(
+ df = substitute(df),
+ paramcd = paramcd
+ )
+ )
+ # 2. rtables layout ---
+ lyt <- list()
+ lyt$structure <- substitute(
+ expr = basic_table() %>%
+ split_cols_by(arm, split_fun = drop_split_levels) %>%
+ split_rows_by(visit, split_fun = drop_split_levels) %>%
+ split_cols_by_multivar(
+ vars = c("AVAL", "CHG"),
+ varlabels = c("Value", "Change")
+ ),
+ env = list(
+ arm = arm,
+ visit = visit
+ )
+ )
+ lyt$analyze <- if (is.null(.stats)) {
+ substitute(
+ summarize_colvars()
+ )
+ } else {
+ substitute(
+ summarize_colvars(.stats = .stats),
+ list(.stats = .stats)
+ )
+ }
+ lyt$build <- substitute(
+ build_table(df = df),
+ list(df = substitute(df))
+ )
+ y$rtables <- pipe_expr(lyt)
+ # Finally returns y as a list with two expressions.
+ y
+}
It is now possible to modify the studied parameter
+(PARAMCD
) in addition to the study arm and visit variables
+names.
+adlb <- tmc_ex_adlb
+result <- rtables_expr(
+ df = adlb, paramcd = "CRP", arm = "ARM", visit = "AVISIT",
+ .stats = c("n", "mean_sd")
+)
The two expressions are consistent:
+
+styled_expr(result$data)
+## adlb <- adlb %>% filter(PARAMCD == "CRP" & ARMCD %in% c(
+## "ARM A",
+## "ARM B"
+## ) & AVISIT == "WEEK 1 DAY 8")
+styled_expr(result$rtables)
+## basic_table() %>%
+## split_cols_by("ARM", split_fun = drop_split_levels) %>%
+## split_rows_by("AVISIT", split_fun = drop_split_levels) %>%
+## split_cols_by_multivar(vars = c("AVAL", "CHG"), varlabels = c(
+## "Value",
+## "Change"
+## )) %>%
+## summarize_colvars(.stats = c("n", "mean_sd")) %>%
+## build_table(df = adlb)
The two expressions can be executed and return the
+rtables
:
+result_exec <- mapply(eval, result)
+result_exec$rtables
+## A: Drug X B: Placebo
+## Value Change Value Change
+## ————————————————————————————————————————————————————————————
+## WEEK 1 DAY 8
+## n 69 69 73 73
+## Mean (SD) 1.0 (0.2) 0.0 (0.3) 1.0 (0.2) 0.0 (0.3)
At this point, it is then possible to:
+rtables
pipelines.pipe_expr
)rtables
pipeline to add
+conditional layers (e.g. .stats
).rtables
+pipeline.vignettes/teal-modules-clinical.Rmd
+ teal-modules-clinical.Rmd
teal.modules.clinical
is a package implementing a number
+of teal
modules helpful for exploring clinical trials data,
+specifically targeted towards data following the ADaM
+standards. teal.modules.clinical
modules can be used with
+data other than ADaM standard clinical data, but some features of the
+package are tailored towards data of this type.
The concepts presented here require knowledge about the core features
+of teal
, specifically on how to launch a teal
+application and how to pass data into it. Therefore, it is highly
+recommended to refer to the home
+page and introductory
+vignette of the teal
package.
The package provides ready-to-use teal
modules you can
+embed in your teal
application. The modules generate highly
+customizable tables, plots, and outputs often used in exploratory data
+analysis, including:
tm_t_ancova()
+tm_t_coxreg()
+tm_g_km()
+tm_t_logistic()
+tm_g_barchart_simple()
+tm_g_ci()
+tm_t_binary_outcome()
+tm_t_events_summary()
+tm_t_smq()
+tm_t_tte()
+The library also offers a group of patient profile modules targeted +for clinical statisticians and physicians who want to review data on a +per patient basis. The modules present data about patient’s adverse +events, their severity, the current therapy, their laboratory results +and more.
+See the full index of package functions & modules here.
+A teal.modules.clinical
module needs to be embedded
+inside a shiny
/teal
application to interact
+with it. A simple application including a bar chart module could look
+like this:
+library(teal.modules.clinical)
+library(nestcolor)
+
+ADSL <- tmc_ex_adsl
+ADAE <- tmc_ex_adae
+
+app <- init(
+ data = cdisc_data(
+ ADSL = ADSL,
+ ADAE = ADAE,
+ code = "
+ ADSL <- tmc_ex_adsl
+ ADAE <- tmc_ex_adae
+ "
+ ),
+ modules = list(
+ tm_g_barchart_simple(
+ label = "ADAE Analysis",
+ x = data_extract_spec(
+ dataname = "ADAE",
+ select = select_spec(
+ choices = variable_choices(
+ ADAE,
+ c(
+ "ARM", "ACTARM", "SEX",
+ "RACE", "SAFFL", "STRATA2"
+ )
+ ),
+ selected = "ACTARM",
+ multiple = FALSE
+ )
+ )
+ )
+ )
+)
+
+if (interactive()) shinyApp(app$ui, app$server)
Consider consulting the documentation and examples of each module
+(e.g. ?tm_g_barchart_simple
). In many, you can also find
+useful links to the TLG
+Catalog where additional example apps can be found.
teal.modules.clinical
exports modules and needs support
+from other libraries to run a teal
app and flesh out its
+functionality. In the example above, tm_g_barchart_simple()
+is the only function from teal.modules.clinical
whereas
+init()
is a teal
function,
+data_extract_spec()
, select_spec()
, and
+variable_choices()
are teal.transform
+functions, and cdisc_data()
is a teal.data
+function.
Let’s break the above app down into pieces:
+ +The above lines load the libraries used in this example. We will use
+the example data provided in the teal.modules.clinical
+package:
+ADSL <- tmc_ex_adsl
+ADAE <- tmc_ex_adae
nestcolor
is an optional package that can be loaded in
+to apply the standardized NEST color palette to all module plots.
There is no need to load teal
as
+teal.modules.clinical
already depends on it.
In the next step, we use teal
to create
+shiny
UI and server functions that we can launch using
+shiny
. The data
argument tells
+teal
about the input data - the ADaM datasets
+ADSL
and ADAE
- and the modules
+argument indicates the modules included in the application. Here, we
+include only one module: tm_g_barchart_simple()
.
+app <- init(
+ data = cdisc_data(
+ ADSL = ADSL,
+ ADAE = ADAE,
+ code = "
+ ADSL <- tmc_ex_adsl
+ ADAE <- tmc_ex_adae
+ "
+ ),
+ modules = list(
+ tm_g_barchart_simple(
+ label = "ADAE Analysis",
+ x = data_extract_spec(
+ dataname = "ADAE",
+ select = select_spec(
+ choices = variable_choices(
+ ADAE,
+ c(
+ "ARM", "ACTARM", "SEX",
+ "RACE", "SAFFL", "STRATA2"
+ )
+ ),
+ selected = "ACTARM",
+ multiple = FALSE
+ )
+ )
+ )
+ )
+)
Finally, we use shiny
to launch the application:
+if (interactive()) shinyApp(app$ui, app$server)
Some teal.modules.clinical
modules allow for the
+specification of arguments using
+teal.transform::choices_selected()
, such as the
+tm_t_summary()
module in the following example.
+ADSL <- tmc_ex_adsl
+
+app <- init(
+ data = cdisc_data(ADSL = ADSL, code = "ADSL <- tmc_ex_adsl"),
+ modules = list(
+ tm_t_summary(
+ label = "Demographic Table",
+ dataname = "ADSL",
+ arm_var = choices_selected(choices = c("ARM", "ARMCD"), selected = "ARM"),
+ summarize_vars = choices_selected(
+ choices = c("SEX", "RACE", "BMRKR2", "EOSDY", "DCSREAS", "AGE"),
+ selected = c("SEX", "RACE")
+ )
+ )
+ )
+)
+
+if (interactive()) shinyApp(app$ui, app$server)
Please refer to the API +reference of specific modules for more examples and information on +the customization options available.
+