.github/CODE_OF_CONDUCT.md
+ We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
+We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
+Examples of behavior that contributes to a positive environment for our community include:
+Examples of unacceptable behavior include:
+Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
+Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
+This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at [INSERT CONTACT METHOD]. All complaints will be reviewed and investigated promptly and fairly.
+All community leaders are obligated to respect the privacy and security of the reporter of any incident.
+Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
+Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
+Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
+Community Impact: A violation through a single incident or series of actions.
+Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
+Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
+Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
+Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
+Consequence: A permanent ban from any sort of public interaction within the community.
+This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.
+Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.
+For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
+.github/CONTRIBUTING.md
+ 🙏 Thank you for taking the time to contribute!
+Your input is deeply valued, whether an issue, a pull request, or even feedback, regardless of size, content or scope.
+Please refer to the project documentation for a brief introduction. Please also see the other articles within the project documentation for additional information.
+A Code of Conduct governs this project. Participants and contributors are expected to follow the rules outlined therein.
+All your contributions will be covered by this project’s license.
+We use GitHub to track issues, feature requests, and bugs. Before submitting a new issue, please check if the issue has already been reported. If the issue already exists, please upvote the existing issue 👍.
+For new feature requests, please elaborate on the context and the benefit the feature will have for users, developers, or other relevant personas.
+This repository uses the GitHub Flow model for collaboration. To submit a pull request:
+Create a branch
+Please see the branch naming convention below. If you don’t have write access to this repository, please fork it.
+Make changes
+Make sure your code
+Create a pull request (PR)
+In the pull request description, please link the relevant issue (if any), provide a detailed description of the change, and include any assumptions.
+Address review comments, if any
Post approval
+Merge your PR if you have write access. Otherwise, the reviewer will merge the PR on your behalf.
+Pat yourself on the back
+Congratulations! 🎉 You are now an official contributor to this project! We are grateful for your contribution.
+If your changes are related to a current issue in the current project, please name your branch as follows: <issue_id>_<short_description>. Please use an underscore (_) as the delimiter for word separation. For example, 420_fix_ui_bug would be a suitable branch name if your change resolves a UI-related bug reported in issue number 420 in the current project.
If your change affects multiple repositories, please name your branches as follows: <issue_id>_<issue_repo>_<short_description>. For example, 69_awesomeproject_fix_spelling_error would reference issue 69 reported in project awesomeproject and aim to resolve one or more spelling errors in multiple (likely related) repositories.
monorepo and staged.dependencies
+Sometimes you might need to change upstream dependent package(s) to be able to submit a meaningful change. We use the staged.dependencies functionality to simulate monorepo behavior. The dependency configuration is already specified in this project's staged_dependencies.yaml file; you only need to name your feature branches appropriately. This is the only exception to the branch naming convention described above. Please refer to the staged.dependencies package documentation for more details.
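For orientation only, a staged_dependencies.yaml file typically declares upstream and downstream repositories. The snippet below is an illustrative sketch with placeholder repository names, not this project's actual configuration:

```yaml
# Illustrative placeholders only - not this project's real configuration.
upstream_repos:
- repo: example-org/upstream-package
  host: https://github.com
downstream_repos:
- repo: example-org/downstream-package
  host: https://github.com
```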
+This repository follows some unified processes and standards adopted by its maintainers to ensure software development is carried out consistently within teams and cohesively across other repositories.
+This repository follows the standard tidyverse style guide and uses lintr for lint checks. Customized lint configurations are available in this repository's .lintr file.
Lightweight is the right weight: this repository follows the tinyverse recommendations of limiting dependencies to a minimum.
+If the code is not compatible with all (!) historical versions of a given dependent package, it is required to specify the minimal version in the DESCRIPTION file. In particular, if the development version requires (imports) the development version of another package, it is required to put abc (>= 1.2.3.9000).
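Continuing the example above (abc and the version number are the placeholder values from that sentence), the corresponding entry in the DESCRIPTION file would look like:

```
Imports:
    abc (>= 1.2.3.9000)
```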
We continuously test our packages against the newest R version along with the most recent dependencies from CRAN and Bioconductor. We recommend that your working environment is also set up in the same way. You can find the details about the R version and packages used in the R CMD check GitHub Action execution log - there is a step that prints out the R sessionInfo().
If you discover bugs on older R versions or with an older set of dependencies, please create the relevant bug reports.
+pre-commit
We highly recommend that you use the pre-commit tool combined with the R hooks for pre-commit to execute some of the checks before committing and pushing your changes. Pre-commit hooks are already configured in this repository's .pre-commit-config.yaml file.
As mentioned previously, all contributions are deeply valued and appreciated. While all contribution data is available as part of the repository insights, to recognize a significant contribution and hence add the contributor to the package authors list, the following rules are enforced:
+(determined by a git blame query) OR
*Excluding auto-generated code, including but not limited to roxygen comments or renv.lock files.
The package maintainer also reserves the right to adjust the criteria to recognize contributions.
Copyright 2022 F. Hoffmann-La Roche AG

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
SECURITY.md
+ If you believe you have found a security vulnerability in any of the repositories in this organization, please report it to us through coordinated disclosure.
+Please do not report security vulnerabilities through public GitHub issues, discussions, or pull requests.
+Instead, please send an email to vulnerability.management[@]roche.com.
+Please include as much of the information listed below as you can to help us better understand and resolve the issue:
+This information will help us triage your report more quickly.
vignettes/missing_values.Rmd
The packages used in this vignette are:
rtables requires that split variables be factors. When you try to split on a variable that is not a factor, a warning message will appear. Here we purposefully convert the SEX variable to character to demonstrate what happens when we try splitting the rows by this variable. To fix this, df_explicit_na will convert the variable to a factor, allowing the table to be generated.
+adsl <- tern_ex_adsl
+adsl$SEX <- as.character(adsl$SEX)
+
+vars <- c("AGE", "SEX", "RACE", "BMRKR1")
+var_labels <- c(
+ "Age (yr)",
+ "Sex",
+ "Race",
+  "Continuous Level Biomarker 1"
+)
+
+result <- basic_table(show_colcounts = TRUE) %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ ) %>%
+ build_table(adsl)
+#> Warning in as_factor_keep_attributes(x, verbose = verbose): automatically
+#> converting character variable x to factor, better manually convert to factor to
+#> avoid failures
+
+#> Warning in as_factor_keep_attributes(x, verbose = verbose): automatically
+#> converting character variable x to factor, better manually convert to factor to
+#> avoid failures
+
+#> Warning in as_factor_keep_attributes(x, verbose = verbose): automatically
+#> converting character variable x to factor, better manually convert to factor to
+#> avoid failures
+
+#> Warning in as_factor_keep_attributes(x, verbose = verbose): automatically
+#> converting character variable x to factor, better manually convert to factor to
+#> avoid failures
+result
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=69) (N=73) (N=58) (N=200)
+#> ———————————————————————————————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 69 73 58 200
+#> Mean (SD) 34.1 (6.8) 35.8 (7.1) 36.1 (7.4) 35.3 (7.1)
+#> Median 32.8 35.4 36.2 34.8
+#> Min - Max 22.4 - 48.0 23.3 - 57.5 23.0 - 58.3 22.4 - 58.3
+#> Sex
+#> n 69 73 58 200
+#> F 38 (55.1%) 40 (54.8%) 32 (55.2%) 110 (55%)
+#> M 31 (44.9%) 33 (45.2%) 26 (44.8%) 90 (45%)
+#> Race
+#> n 69 73 58 200
+#> ASIAN 38 (55.1%) 43 (58.9%) 29 (50%) 110 (55%)
+#> BLACK OR AFRICAN AMERICAN 15 (21.7%) 13 (17.8%) 12 (20.7%) 40 (20%)
+#> WHITE 11 (15.9%) 12 (16.4%) 11 (19%) 34 (17%)
+#> AMERICAN INDIAN OR ALASKA NATIVE 4 (5.8%) 3 (4.1%) 6 (10.3%) 13 (6.5%)
+#> MULTIPLE 1 (1.4%) 1 (1.4%) 0 2 (1%)
+#> NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER 0 1 (1.4%) 0 1 (0.5%)
+#> OTHER 0 0 0 0
+#> UNKNOWN 0 0 0 0
+#>   Continuous Level Biomarker 1
+#> n 69 73 58 200
+#> Mean (SD) 6.3 (3.6) 6.7 (3.5) 6.2 (3.3) 6.4 (3.5)
+#> Median 5.4 6.3 5.4 5.6
+#> Min - Max 0.4 - 17.8 1.0 - 18.5 2.4 - 19.1 0.4 - 19.1
rtables
+Here we purposefully convert all M values to NA in the SEX variable. After running df_explicit_na the NA values are encoded as <Missing>, but they are not included in the table. In addition, the missing values are not included in the n count, nor in the denominator value for calculating the percent values.
+adsl <- tern_ex_adsl
+adsl$SEX[adsl$SEX == "M"] <- NA
+adsl <- df_explicit_na(adsl)
+
+vars <- c("AGE", "SEX")
+var_labels <- c(
+ "Age (yr)",
+ "Sex"
+)
+
+result <- basic_table(show_colcounts = TRUE) %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ ) %>%
+ build_table(adsl)
+result
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=69) (N=73) (N=58) (N=200)
+#> ———————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 69 73 58 200
+#> Mean (SD) 34.1 (6.8) 35.8 (7.1) 36.1 (7.4) 35.3 (7.1)
+#> Median 32.8 35.4 36.2 34.8
+#> Min - Max 22.4 - 48.0 23.3 - 57.5 23.0 - 58.3 22.4 - 58.3
+#> Sex
+#> n 38 40 32 110
+#> F 38 (100%) 40 (100%) 32 (100%) 110 (100%)
+#> M 0 0 0 0
If you want the NA values to be displayed in the table and included in the n count and as the denominator for calculating percent values, use the na_level argument.
+adsl <- tern_ex_adsl
+adsl$SEX[adsl$SEX == "M"] <- NA
+adsl <- df_explicit_na(adsl, na_level = "Missing Values")
+
+result <- basic_table(show_colcounts = TRUE) %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ ) %>%
+ build_table(adsl)
+result
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=69) (N=73) (N=58) (N=200)
+#> ————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 69 73 58 200
+#> Mean (SD) 34.1 (6.8) 35.8 (7.1) 36.1 (7.4) 35.3 (7.1)
+#> Median 32.8 35.4 36.2 34.8
+#> Min - Max 22.4 - 48.0 23.3 - 57.5 23.0 - 58.3 22.4 - 58.3
+#> Sex
+#> n 69 73 58 200
+#> F 38 (55.1%) 40 (54.8%) 32 (55.2%) 110 (55%)
+#> M 0 0 0 0
+#> Missing Values 31 (44.9%) 33 (45.2%) 26 (44.8%) 90 (45%)
Numeric variables that have missing values are not altered. This means that any NA value in a numeric variable will not be included in the summary statistics, nor will it be included in the denominator value for calculating percent values. Here we set any value less than 30 in the AGE variable to missing, so only values of 30 or greater are included in the table below.
+adsl <- tern_ex_adsl
+adsl$AGE[adsl$AGE < 30] <- NA
+adsl <- df_explicit_na(adsl)
+
+vars <- c("AGE", "SEX")
+var_labels <- c(
+ "Age (yr)",
+ "Sex"
+)
+
+result <- basic_table(show_colcounts = TRUE) %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ ) %>%
+ build_table(adsl)
+result
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=69) (N=73) (N=58) (N=200)
+#> ———————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 46 56 44 146
+#> Mean (SD) 37.8 (5.2) 38.3 (6.3) 39.1 (5.9) 38.3 (5.8)
+#> Median 37.2 37.3 37.5 37.5
+#> Min - Max 30.3 - 48.0 30.0 - 57.5 30.5 - 58.3 30.0 - 58.3
+#> Sex
+#> n 69 73 58 200
+#> F 38 (55.1%) 40 (54.8%) 32 (55.2%) 110 (55%)
+#> M 31 (44.9%) 33 (45.2%) 26 (44.8%) 90 (45%)
tern Tabulation
+The tern R package provides functions to create common analyses from clinical trials in R. The core functionality for tabulation is built on the more general purpose rtables package. New users should first begin by reading the “Introduction to tern” and “Introduction to rtables” vignettes.
The packages used in this vignette are:
+ +The datasets used in this vignette are:
+
+adsl <- ex_adsl
+adae <- ex_adae
+adrs <- ex_adrs
tern Analyze Functions
+Analyze functions are used in combination with the rtables layout functions in the pipeline which creates the rtables table. They apply statistical logic to the layout of the rtables table. The table layout is materialized with the rtables::build_table function and the data.
The tern analyze functions are wrappers around the rtables::analyze function; they offer various methods useful from the perspective of clinical trials and other statistical projects.
Examples of the tern analyze functions are tern::count_occurrences, tern::summarize_ancova and tern::analyze_vars. As there is no single prefix identifying all tern analyze functions, it is recommended to use the tern website functions reference.
tern Analyze Functions
+Please skip this subsection if you are not interested in the internals of tern analyze functions.
Internally, tern analyze functions like tern::summarize_ancova are mainly built as a chain of four elements:
h_ancova() -> tern:::s_ancova() -> tern:::a_ancova() -> summarize_ancova()
+The descriptions for each function type:
- Helper functions h_*: these functions are useful to help define the analysis.
- Statistics functions s_*: statistics functions should do the computation of the numbers that are tabulated later. In order to separate computation from formatting, they should not take care of rcell type formatting themselves.
- Formatted analysis functions a_*: these have the same arguments as the corresponding statistics functions, and can be further customized by calling rtables::make_afun() on them. They are used as afun in rtables::analyze().
- Analyze functions rtables::analyze(..., afun = make_afun(tern::a_*)): analyze functions are used in combination with the rtables layout functions, in the pipeline which creates the table. They are the last element of the chain.

We will use the native rtables::analyze function with the tern formatted analysis functions as the afun parameter.
l <- basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ split_rows_by(var = "AVISIT") %>%
+ analyze(vars = "AVAL", afun = a_summary)
+
+build_table(l, df = adrs)
+The rtables::make_afun function is helpful when you want to attach a particular format to a formatted analysis function.
afun <- make_afun(
+ a_summary,
+ .stats = NULL,
+ .formats = c(median = "xx."),
+ .labels = c(median = "My median"),
+ .indent_mods = c(median = 1L)
+)
+
+l2 <- basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ split_rows_by(var = "AVISIT") %>%
+ analyze(vars = "AVAL", afun = afun)
+
+build_table(l2, df = adrs)
+We are going to create 3 different tables using tern
+analyze functions and the rtables
interface.
Table | tern analyze functions
---|---
Demographic Table | analyze_vars() and summarize_num_patients()
Adverse Event Table | count_occurrences()
Response Table | estimate_proportion(), estimate_proportion_diff() and test_proportion_diff()
Demographic tables provide a summary of the characteristics of +patients enrolled in a clinical trial. Typically the table columns +represent treatment arms and variables summarized in the table are +demographic properties such as age, sex, race, etc.
+In the example below the only function from tern is analyze_vars() and the remaining layout functions are from rtables.
+# Select variables to include in table.
+vars <- c("AGE", "SEX")
+var_labels <- c("Age (yr)", "Sex")
+
+basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ add_colcounts() %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ ) %>%
+ build_table(adsl)
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ——————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 134 134 132 400
+#> Mean (SD) 33.8 (6.6) 35.4 (7.9) 35.4 (7.7) 34.9 (7.4)
+#> Median 33.0 35.0 35.0 34.0
+#> Min - Max 21.0 - 50.0 21.0 - 62.0 20.0 - 69.0 20.0 - 69.0
+#> Sex
+#> n 134 134 132 400
+#> F 79 (59%) 77 (57.5%) 66 (50%) 222 (55.5%)
+#> M 51 (38.1%) 55 (41%) 60 (45.5%) 166 (41.5%)
+#> U 3 (2.2%) 2 (1.5%) 4 (3%) 9 (2.2%)
+#> UNDIFFERENTIATED 1 (0.7%) 0 2 (1.5%) 3 (0.8%)
To change the display order of categorical variables in a table, use factor variables and explicitly set the order of the levels. This applies to the display order in both columns and rows. Note that the forcats package has many useful functions to help with these types of data processing steps (not used below).
+# Reorder the levels in the ARM variable.
+adsl$ARM <- factor(adsl$ARM, levels = c("B: Placebo", "A: Drug X", "C: Combination")) # nolint
+
+# Reorder the levels in the SEX variable.
+adsl$SEX <- factor(adsl$SEX, levels = c("M", "F", "U", "UNDIFFERENTIATED")) # nolint
+
+basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ add_colcounts() %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ ) %>%
+ build_table(adsl)
+#> B: Placebo A: Drug X C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ——————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 134 134 132 400
+#> Mean (SD) 35.4 (7.9) 33.8 (6.6) 35.4 (7.7) 34.9 (7.4)
+#> Median 35.0 33.0 35.0 34.0
+#> Min - Max 21.0 - 62.0 21.0 - 50.0 20.0 - 69.0 20.0 - 69.0
+#> Sex
+#> n 134 134 132 400
+#> M 55 (41%) 51 (38.1%) 60 (45.5%) 166 (41.5%)
+#> F 77 (57.5%) 79 (59%) 66 (50%) 222 (55.5%)
+#> U 2 (1.5%) 3 (2.2%) 4 (3%) 9 (2.2%)
+#> UNDIFFERENTIATED 0 1 (0.7%) 2 (1.5%) 3 (0.8%)
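As a side note, the base factor() reordering above could also be written with forcats. This is an illustrative sketch only, since forcats is not used in this vignette:

```r
library(forcats)

# fct_relevel() moves the named levels to the front, in the given order,
# keeping any remaining levels in their original order afterwards.
adsl$ARM <- fct_relevel(adsl$ARM, "B: Placebo", "A: Drug X", "C: Combination")
adsl$SEX <- fct_relevel(adsl$SEX, "M", "F", "U", "UNDIFFERENTIATED")
```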
The tern package includes many functions similar to analyze_vars(). These functions are called layout creating functions and are used in combination with other rtables layout functions, just like in the examples above. Layout creating functions wrap calls to the rtables functions analyze(), analyze_colvars() and summarize_row_groups(), and provide options for easy formatting and analysis modifications.
The display of the demographics table can be customized via the arguments in analyze_vars(). Most layout creating functions in tern include the standard arguments .stats, .formats, .labels and .indent_mods, which control which statistics are displayed and how the numbers are formatted. Refer to the package help with help("analyze_vars") or ?analyze_vars to see the full set of options.
For this example we will change the default summary for numeric +variables to include the number of records, and the mean and standard +deviation (in a single statistic, i.e. within a single cell). For +categorical variables we modify the summary to include the number of +records and the counts of categories. We also modify the display format +for the mean and standard deviation to print two decimal places instead +of just one.
+
+# Select statistics and modify default formats.
+basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ add_colcounts() %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels,
+ .stats = c("n", "mean_sd", "count"),
+ .formats = c(mean_sd = "xx.xx (xx.xx)")
+ ) %>%
+ build_table(adsl)
+#> B: Placebo A: Drug X C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ————————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 134 134 132 400
+#> Mean (SD) 35.43 (7.90) 33.77 (6.55) 35.43 (7.72) 34.88 (7.44)
+#> Sex
+#> n 134 134 132 400
+#> M 55 51 60 166
+#> F 77 79 66 222
+#> U 2 3 4 9
+#> UNDIFFERENTIATED 0 1 2 3
One feature of a layout
is that it can be used with
+different datasets to create different summaries. For example, here we
+can easily create the same summary of demographics for the Brazil and
+China subgroups, respectively:
+lyt <- basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ add_overall_col("All Patients") %>%
+ add_colcounts() %>%
+ analyze_vars(
+ vars = vars,
+ var_labels = var_labels
+ )
+
+build_table(lyt, df = adsl %>% dplyr::filter(COUNTRY == "BRA"))
+#> B: Placebo A: Drug X C: Combination All Patients
+#> (N=7) (N=13) (N=10) (N=30)
+#> ——————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 7 13 10 30
+#> Mean (SD) 32.0 (6.1) 36.7 (6.4) 38.3 (10.6) 36.1 (8.1)
+#> Median 32.0 37.0 35.0 35.5
+#> Min - Max 25.0 - 42.0 24.0 - 47.0 25.0 - 64.0 24.0 - 64.0
+#> Sex
+#> n 7 13 10 30
+#> M 4 (57.1%) 8 (61.5%) 5 (50%) 17 (56.7%)
+#> F 3 (42.9%) 5 (38.5%) 5 (50%) 13 (43.3%)
+#> U 0 0 0 0
+#> UNDIFFERENTIATED 0 0 0 0
+
+build_table(lyt, df = adsl %>% dplyr::filter(COUNTRY == "CHN"))
+#> B: Placebo A: Drug X C: Combination All Patients
+#> (N=81) (N=74) (N=64) (N=219)
+#> ——————————————————————————————————————————————————————————————————————————————
+#> Age (yr)
+#> n 81 74 64 219
+#> Mean (SD) 35.7 (7.3) 33.0 (6.4) 35.2 (6.4) 34.6 (6.8)
+#> Median 36.0 32.0 35.0 34.0
+#> Min - Max 21.0 - 58.0 23.0 - 48.0 21.0 - 49.0 21.0 - 58.0
+#> Sex
+#> n 81 74 64 219
+#> M 35 (43.2%) 27 (36.5%) 30 (46.9%) 92 (42%)
+#> F 45 (55.6%) 44 (59.5%) 29 (45.3%) 118 (53.9%)
+#> U 1 (1.2%) 2 (2.7%) 3 (4.7%) 6 (2.7%)
+#> UNDIFFERENTIATED 0 1 (1.4%) 2 (3.1%) 3 (1.4%)
The standard table of adverse events is a summary by system organ class and preferred term. For frequency counts by preferred term, if there are multiple occurrences of the same AE in an individual, we count them only once.
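The "count each patient only once" rule can be sketched in base R (illustrative only; count_occurrences() performs this counting internally):

```r
# For a single preferred term, a patient with several occurrences of the
# same AE contributes only one count:
n_patients <- length(unique(adae$USUBJID[adae$AEDECOD == "dcd A.1.1.1.1"]))
```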
To create this table we will need to use a combination of several +layout creating functions in a tabulation pipeline.
+We start by creating the high-level summary. The layout creating
+function in tern
that can do this is
+summarize_num_patients()
:
+basic_table() %>%
+ split_cols_by(var = "ACTARM") %>%
+ add_colcounts() %>%
+ add_overall_col(label = "All Patients") %>%
+ summarize_num_patients(
+ var = "USUBJID",
+ .stats = c("unique", "nonunique"),
+ .labels = c(
+ unique = "Total number of patients with at least one AE",
+ nonunique = "Overall total number of events"
+ )
+ ) %>%
+ build_table(
+ df = adae,
+ alt_counts_df = adsl
+ )
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> —————————————————————————————————————————————————————————————————————————————————————————————————————————
+#> Total number of patients with at least one AE 122 (91.0%) 123 (91.8%) 120 (90.9%) 365 (91.2%)
+#> Overall total number of events 609 622 703 1934
Note that for this table, the denominator used for percentages and shown in the header of the table (N = xx) is defined based on the subject-level dataset adsl. This is done by using the alt_counts_df argument in build_table(), which provides an alternative data set for deriving the counts in the header. This is often required when we work with data sets that include multiple records per patient as df, such as adae here.
Before building out the rest of the AE
table it is
+helpful to introduce some more tern
package design
+conventions. Each layout creating function in tern
is a
+wrapper for a Statistics function. Statistics functions are the ones
+that do the actual computation of numbers in a table. These functions
+always return named lists whose elements are the statistics available to
+include in a layout via the .stats
argument at the layout
+creating function level.
Statistics functions follow a naming convention of always beginning with s_*, and for ease of use they are documented on the same page as their layout creating function counterpart. It is helpful to review a Statistics function to understand the logic used to calculate the numbers in a table and to see what options may be available to modify the analysis.
For example, the Statistics function calculating the numbers in summarize_num_patients() is s_num_patients(). The result of this Statistics function is a list with the elements unique, nonunique and unique_count:
+s_num_patients(x = adae$USUBJID, labelstr = "", .N_col = nrow(adae))
+#> $unique
+#> [1] 365.000000 0.188728
+#> attr(,"label")
+#> [1] ""
+#>
+#> $nonunique
+#> [1] 1934
+#> attr(,"label")
+#> [1] ""
+#>
+#> $unique_count
+#> [1] 365
+#> attr(,"label")
+#> [1] " (n)"
From these results you can see that the unique
and
+nonunique
statistics are those displayed in the “All
+Patients” column in the initial AE
table output above. Also
+you can see that these are raw numbers and are not formatted in any way.
+All formatting functionality is handled at the layout creating function
+level with the .formats
argument.
Now that we know what types of statistics can be derived by s_num_patients(), we can try modifying the default layout returned by summarize_num_patients(). Instead of reporting the unique and nonunique statistics, we specify that the analysis should include only the unique_count statistic. The result will show only the counts of unique patients. Note that we make this update in both the .stats and .labels arguments of summarize_num_patients().
+basic_table() %>%
+ split_cols_by(var = "ACTARM") %>%
+ add_colcounts() %>%
+ add_overall_col(label = "All Patients") %>%
+ summarize_num_patients(
+ var = "USUBJID",
+ .stats = "unique_count",
+ .labels = c(unique_count = "Total number of patients with at least one AE")
+ ) %>%
+ build_table(
+ df = adae,
+ alt_counts_df = adsl
+ )
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ——————————————————————————————————————————————————————————————————————————————————————————————————————
+#> Total number of patients with at least one AE 122 123 120 365
Let’s now continue building on the layout for the adverse event +table.
+After we have the top-level summary, we can repeat the same summary
+at each system organ class level. To do this we split the analysis data
+with split_rows_by()
before calling again
+summarize_num_patients()
.
+basic_table() %>%
+ split_cols_by(var = "ACTARM") %>%
+ add_colcounts() %>%
+ add_overall_col(label = "All Patients") %>%
+ summarize_num_patients(
+ var = "USUBJID",
+ .stats = c("unique", "nonunique"),
+ .labels = c(
+ unique = "Total number of patients with at least one AE",
+ nonunique = "Overall total number of events"
+ )
+ ) %>%
+ split_rows_by(
+ "AEBODSYS",
+ child_labels = "visible",
+ nested = FALSE,
+ indent_mod = -1L,
+ split_fun = drop_split_levels
+ ) %>%
+ summarize_num_patients(
+ var = "USUBJID",
+ .stats = c("unique", "nonunique"),
+ .labels = c(
+ unique = "Total number of patients with at least one AE",
+ nonunique = "Overall total number of events"
+ )
+ ) %>%
+ build_table(
+ df = adae,
+ alt_counts_df = adsl
+ )
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ———————————————————————————————————————————————————————————————————————————————————————————————————————————
+#> Total number of patients with at least one AE 122 (91.0%) 123 (91.8%) 120 (90.9%) 365 (91.2%)
+#> Overall total number of events 609 622 703 1934
+#> cl A.1
+#> Total number of patients with at least one AE 78 (58.2%) 75 (56.0%) 89 (67.4%) 242 (60.5%)
+#> Overall total number of events 132 130 160 422
+#> cl B.1
+#> Total number of patients with at least one AE 47 (35.1%) 49 (36.6%) 43 (32.6%) 139 (34.8%)
+#> Overall total number of events 56 60 62 178
+#> cl B.2
+#> Total number of patients with at least one AE 79 (59.0%) 74 (55.2%) 85 (64.4%) 238 (59.5%)
+#> Overall total number of events 129 138 143 410
+#> cl C.1
+#> Total number of patients with at least one AE 43 (32.1%) 46 (34.3%) 43 (32.6%) 132 (33.0%)
+#> Overall total number of events 55 63 64 182
+#> cl C.2
+#> Total number of patients with at least one AE 35 (26.1%) 48 (35.8%) 55 (41.7%) 138 (34.5%)
+#> Overall total number of events 48 53 65 166
+#> cl D.1
+#> Total number of patients with at least one AE 79 (59.0%) 67 (50.0%) 80 (60.6%) 226 (56.5%)
+#> Overall total number of events 127 106 135 368
+#> cl D.2
+#> Total number of patients with at least one AE 47 (35.1%) 58 (43.3%) 57 (43.2%) 162 (40.5%)
+#> Overall total number of events 62 72 74 208
+The table looks almost ready. For the final step, we need a layout
+creating function that can produce a count table of event frequencies.
+The layout creating function for this is count_occurrences(). Let’s
+first try using this function in a simpler layout without row splits:
+basic_table() %>%
+ split_cols_by(var = "ACTARM") %>%
+ add_colcounts() %>%
+ add_overall_col(label = "All Patients") %>%
+ count_occurrences(vars = "AEDECOD") %>%
+ build_table(
+ df = adae,
+ alt_counts_df = adsl
+ )
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ———————————————————————————————————————————————————————————————————————
+#> dcd A.1.1.1.1 50 (37.3%) 45 (33.6%) 63 (47.7%) 158 (39.5%)
+#> dcd A.1.1.1.2 48 (35.8%) 48 (35.8%) 50 (37.9%) 146 (36.5%)
+#> dcd B.1.1.1.1 47 (35.1%) 49 (36.6%) 43 (32.6%) 139 (34.8%)
+#> dcd B.2.1.2.1 49 (36.6%) 44 (32.8%) 52 (39.4%) 145 (36.2%)
+#> dcd B.2.2.3.1 48 (35.8%) 54 (40.3%) 51 (38.6%) 153 (38.2%)
+#> dcd C.1.1.1.3 43 (32.1%) 46 (34.3%) 43 (32.6%) 132 (33.0%)
+#> dcd C.2.1.2.1 35 (26.1%) 48 (35.8%) 55 (41.7%) 138 (34.5%)
+#> dcd D.1.1.1.1 50 (37.3%) 42 (31.3%) 51 (38.6%) 143 (35.8%)
+#> dcd D.1.1.4.2 48 (35.8%) 42 (31.3%) 50 (37.9%) 140 (35.0%)
+#> dcd D.2.1.5.3 47 (35.1%) 58 (43.3%) 57 (43.2%) 162 (40.5%)
+Putting everything together, the final AE table looks like this:
+basic_table() %>%
+ split_cols_by(var = "ACTARM") %>%
+ add_colcounts() %>%
+ add_overall_col(label = "All Patients") %>%
+ summarize_num_patients(
+ var = "USUBJID",
+ .stats = c("unique", "nonunique"),
+ .labels = c(
+ unique = "Total number of patients with at least one AE",
+ nonunique = "Overall total number of events"
+ )
+ ) %>%
+ split_rows_by(
+ "AEBODSYS",
+ child_labels = "visible",
+ nested = FALSE,
+ indent_mod = -1L,
+ split_fun = drop_split_levels
+ ) %>%
+ summarize_num_patients(
+ var = "USUBJID",
+ .stats = c("unique", "nonunique"),
+ .labels = c(
+ unique = "Total number of patients with at least one AE",
+ nonunique = "Overall total number of events"
+ )
+ ) %>%
+ count_occurrences(vars = "AEDECOD") %>%
+ build_table(
+ df = adae,
+ alt_counts_df = adsl
+ )
+#> A: Drug X B: Placebo C: Combination All Patients
+#> (N=134) (N=134) (N=132) (N=400)
+#> ———————————————————————————————————————————————————————————————————————————————————————————————————————————
+#> Total number of patients with at least one AE 122 (91.0%) 123 (91.8%) 120 (90.9%) 365 (91.2%)
+#> Overall total number of events 609 622 703 1934
+#> cl A.1
+#> Total number of patients with at least one AE 78 (58.2%) 75 (56.0%) 89 (67.4%) 242 (60.5%)
+#> Overall total number of events 132 130 160 422
+#> dcd A.1.1.1.1 50 (37.3%) 45 (33.6%) 63 (47.7%) 158 (39.5%)
+#> dcd A.1.1.1.2 48 (35.8%) 48 (35.8%) 50 (37.9%) 146 (36.5%)
+#> cl B.1
+#> Total number of patients with at least one AE 47 (35.1%) 49 (36.6%) 43 (32.6%) 139 (34.8%)
+#> Overall total number of events 56 60 62 178
+#> dcd B.1.1.1.1 47 (35.1%) 49 (36.6%) 43 (32.6%) 139 (34.8%)
+#> cl B.2
+#> Total number of patients with at least one AE 79 (59.0%) 74 (55.2%) 85 (64.4%) 238 (59.5%)
+#> Overall total number of events 129 138 143 410
+#> dcd B.2.1.2.1 49 (36.6%) 44 (32.8%) 52 (39.4%) 145 (36.2%)
+#> dcd B.2.2.3.1 48 (35.8%) 54 (40.3%) 51 (38.6%) 153 (38.2%)
+#> cl C.1
+#> Total number of patients with at least one AE 43 (32.1%) 46 (34.3%) 43 (32.6%) 132 (33.0%)
+#> Overall total number of events 55 63 64 182
+#> dcd C.1.1.1.3 43 (32.1%) 46 (34.3%) 43 (32.6%) 132 (33.0%)
+#> cl C.2
+#> Total number of patients with at least one AE 35 (26.1%) 48 (35.8%) 55 (41.7%) 138 (34.5%)
+#> Overall total number of events 48 53 65 166
+#> dcd C.2.1.2.1 35 (26.1%) 48 (35.8%) 55 (41.7%) 138 (34.5%)
+#> cl D.1
+#> Total number of patients with at least one AE 79 (59.0%) 67 (50.0%) 80 (60.6%) 226 (56.5%)
+#> Overall total number of events 127 106 135 368
+#> dcd D.1.1.1.1 50 (37.3%) 42 (31.3%) 51 (38.6%) 143 (35.8%)
+#> dcd D.1.1.4.2 48 (35.8%) 42 (31.3%) 50 (37.9%) 140 (35.0%)
+#> cl D.2
+#> Total number of patients with at least one AE 47 (35.1%) 58 (43.3%) 57 (43.2%) 162 (40.5%)
+#> Overall total number of events 62 72 74 208
+#> dcd D.2.1.5.3 47 (35.1%) 58 (43.3%) 57 (43.2%) 162 (40.5%)
+A typical response table for a binary clinical trial endpoint may be
+composed of several different analyses: the proportion of responders in
+each treatment group, the difference in response rates between arms, and
+a statistical test of that difference. We can build a table layout like
+this by following the same approach we used for the AE table: each table
+section is produced using a different layout creating function from
+tern.
+First we start with some data preparation steps to set up the analysis
+dataset. We select the endpoint to analyze from PARAMCD and define the
+logical variable is_rsp, which indicates whether a patient is classified
+as a responder or not.
+# Preprocessing to select an analysis endpoint.
+anl <- adrs %>%
+ dplyr::filter(PARAMCD == "BESRSPI") %>%
+ dplyr::mutate(is_rsp = AVALC %in% c("CR", "PR"))
To create a summary of the proportion of responders in each treatment
+group, use the estimate_proportion()
layout creating
+function:
+basic_table() %>%
+ split_cols_by(var = "ARM") %>%
+ add_colcounts() %>%
+ estimate_proportion(
+ vars = "is_rsp",
+ table_names = "est_prop"
+ ) %>%
+ build_table(anl)
+#> A: Drug X B: Placebo C: Combination
+#> (N=134) (N=134) (N=132)
+#> —————————————————————————————————————————————————————————————————————————————
+#> Responders 114 (85.1%) 90 (67.2%) 120 (90.9%)
+#> 95% CI (Wald, with correction) (78.7, 91.5) (58.8, 75.5) (85.6, 96.2)
+To specify which arm in the table should be used as the reference, use
+the ref_group argument of split_cols_by(). Below we change the reference
+arm to “B: Placebo”, so this arm is displayed as the first column:
+basic_table() %>%
+ split_cols_by(var = "ARM", ref_group = "B: Placebo") %>%
+ add_colcounts() %>%
+ estimate_proportion(
+ vars = "is_rsp"
+ ) %>%
+ build_table(anl)
+#> B: Placebo A: Drug X C: Combination
+#> (N=134) (N=134) (N=132)
+#> —————————————————————————————————————————————————————————————————————————————
+#> Responders 90 (67.2%) 114 (85.1%) 120 (90.9%)
+#> 95% CI (Wald, with correction) (58.8, 75.5) (78.7, 91.5) (85.6, 96.2)
To further customize the analysis, we can use the method
+and conf_level
arguments to modify the type of confidence
+interval that is calculated:
+basic_table() %>%
+ split_cols_by(var = "ARM", ref_group = "B: Placebo") %>%
+ add_colcounts() %>%
+ estimate_proportion(
+ vars = "is_rsp",
+ method = "clopper-pearson",
+ conf_level = 0.9
+ ) %>%
+ build_table(anl)
+#> B: Placebo A: Drug X C: Combination
+#> (N=134) (N=134) (N=132)
+#> ———————————————————————————————————————————————————————————————————————
+#> Responders 90 (67.2%) 114 (85.1%) 120 (90.9%)
+#> 90% CI (Clopper-Pearson) (59.9, 73.9) (79.1, 89.9) (85.7, 94.7)
+The next table section should summarize the difference in response rates
+between the reference arm and each comparison arm. Use the
+estimate_proportion_diff() layout creating function for this:
+basic_table() %>%
+ split_cols_by(var = "ARM", ref_group = "B: Placebo") %>%
+ add_colcounts() %>%
+ estimate_proportion_diff(
+ vars = "is_rsp",
+ show_labels = "visible",
+ var_labels = "Unstratified Analysis"
+ ) %>%
+ build_table(anl)
+#> B: Placebo A: Drug X C: Combination
+#> (N=134) (N=134) (N=132)
+#> ——————————————————————————————————————————————————————————————————————————————
+#> Unstratified Analysis
+#> Difference in Response rate (%) 17.9 23.7
+#> 95% CI (Wald, with correction) (7.2, 28.6) (13.7, 33.8)
The final section needed to complete the table includes a statistical
+test for the difference in response rates. Use the
+test_proportion_diff()
layout creating function for
+this:
+basic_table() %>%
+ split_cols_by(var = "ARM", ref_group = "B: Placebo") %>%
+ add_colcounts() %>%
+ test_proportion_diff(vars = "is_rsp") %>%
+ build_table(anl)
+#> B: Placebo A: Drug X C: Combination
+#> (N=134) (N=134) (N=132)
+#> ——————————————————————————————————————————————————————————————————————
+#> p-value (Chi-Squared Test) 0.0006 <0.0001
To customize the output, we use the method
argument to
+select a Chi-Squared test with Schouten correction.
+basic_table() %>%
+ split_cols_by(var = "ARM", ref_group = "B: Placebo") %>%
+ add_colcounts() %>%
+ test_proportion_diff(
+ vars = "is_rsp",
+ method = "schouten"
+ ) %>%
+ build_table(anl)
+#> B: Placebo A: Drug X C: Combination
+#> (N=134) (N=134) (N=132)
+#> ———————————————————————————————————————————————————————————————————————————————————————————————
+#> p-value (Chi-Squared Test with Schouten Correction) 0.0008 <0.0001
+Now we can put all the table sections together in one layout pipeline.
+Note there is one more small change needed: since the primary analysis
+variable in all table sections is the same (is_rsp), we need to give
+each sub-table a unique name. This is done by supplying a unique name
+for each section via the table_names argument:
+basic_table() %>%
+ split_cols_by(var = "ARM", ref_group = "B: Placebo") %>%
+ add_colcounts() %>%
+ estimate_proportion(
+ vars = "is_rsp",
+ method = "clopper-pearson",
+ conf_level = 0.9,
+ table_names = "est_prop"
+ ) %>%
+ estimate_proportion_diff(
+ vars = "is_rsp",
+ show_labels = "visible",
+ var_labels = "Unstratified Analysis",
+ table_names = "est_prop_diff"
+ ) %>%
+ test_proportion_diff(
+ vars = "is_rsp",
+ method = "schouten",
+ table_names = "test_prop_diff"
+ ) %>%
+ build_table(anl)
+#> B: Placebo A: Drug X C: Combination
+#> (N=134) (N=134) (N=132)
+#> ————————————————————————————————————————————————————————————————————————————————————————————————————
+#> Responders 90 (67.2%) 114 (85.1%) 120 (90.9%)
+#> 90% CI (Clopper-Pearson) (59.9, 73.9) (79.1, 89.9) (85.7, 94.7)
+#> Unstratified Analysis
+#> Difference in Response rate (%) 17.9 23.7
+#> 95% CI (Wald, with correction) (7.2, 28.6) (13.7, 33.8)
+#> p-value (Chi-Squared Test with Schouten Correction) 0.0008 <0.0001
+Tabulation with tern builds on top of the layout tabulation framework
+from rtables. Complex tables are built step by step in a pipeline by
+combining layout creating functions, each of which performs a specific
+type of analysis.
The tern
analyze functions introduced in this vignette
+are:
analyze_vars()
summarize_num_patients()
count_occurrences()
estimate_proportion()
estimate_proportion_diff()
test_proportion_diff()
+Layout creating functions build a formatted layout by controlling
+features such as labels, numerical display formats, and indentation.
+These functions are wrappers around the Statistics functions which
+calculate the raw summaries of each analysis. You can easily spot
+Statistics functions in the documentation because they always begin with
+the prefix s_. It can be helpful to inspect and run Statistics functions
+to understand the ways an analysis can be customized.
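As a quick illustration (a minimal sketch, assuming the tern package is installed), the statistics function s_summary(), which underlies analyze_vars(), can be called directly on a vector to inspect the raw, unformatted summary values:

```r
library(tern)

# s_summary() is the statistics function wrapped by analyze_vars().
# Called directly, it returns the raw summary values as a named list
# (n, mean, sd, median, range, ...), before any formatting or table
# layout is applied.
s_summary(c(1.2, 3.4, 5.6, 7.8))
```

Running the statistics function by itself like this makes it easier to see which raw values a layout creating function has available for display.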
This vignette shows the general purpose and syntax of the
+tern R package.
+The tern R package contains analytical functions for creating tables and
+graphs useful for clinical trials and other statistical analyses. The
+main focus is on clinical trial reporting tables, but the graphs related
+to clinical trials are also valuable. The core functionality for
+tabulation is built on top of the more general purpose rtables
+package.
The package provides a large range of functionality to create tables
+and graphs used for clinical trial and other statistical analyses:
+rtables tabulation extended by clinical trial specific functions:
+MMRM, logistic regression, Cox regression, …
+rtables tabulation helper functions
+data visualizations connected with clinical trials
+data visualization helper functions
+The reference of tern
functions is available on the
+tern website functions reference.
rtables
+Analytical functions are used in combination with other rtables layout
+functions in the pipeline which creates the rtables table. They apply
+statistical logic to the layout of the rtables table. The table layout
+is materialized with the rtables::build_table function and the
+data.
The tern
analytical functions are wrappers around the
+rtables::analyze
function; they offer various methods
+useful from the perspective of clinical trials and other statistical
+projects.
Examples of the tern analytical functions are
+tern::count_occurrences, tern::summarize_ancova, and
+tern::analyze_vars. As there is no single prefix identifying all tern
+analytical functions, it is recommended to use the reference section on
+the tern website.
In the rtables code below we first describe the two
+tables and assign the descriptions to the variables lyt and lyt2. We
+then build the tables from the actual data with rtables::build_table.
+The description of a table is called a table layout. The analyze
+instruction adds to the layout that the AVAL variable should be analyzed
+with the mean analysis function and the result rounded to 1 decimal
+place. Hence, a layout is “pre-data”; that is, it’s a description of how
+to build a table once we get data.
Defining the table layout with pure rtables code:
+# Create table layout pure rtables
+lyt <- rtables::basic_table() %>%
+ rtables::split_cols_by(var = "ARM") %>%
+ rtables::split_rows_by(var = "AVISIT") %>%
+ rtables::analyze(vars = "AVAL", mean, format = "xx.x")
Below, the only tern function is analyze_vars, which
+replaces the rtables::analyze call above.
+# Create table layout with tern analyze_vars analyze function
+lyt2 <- rtables::basic_table() %>%
+ rtables::split_cols_by(var = "ARM") %>%
+ rtables::split_rows_by(var = "AVISIT") %>%
+ tern::analyze_vars(vars = "AVAL", .formats = c("mean_sd" = "(xx.xx, xx.xx)"))
+# Apply table layout to data and produce `rtables` object
+
+adrs <- formatters::ex_adrs
+
+rtables::build_table(lyt, df = adrs)
+#> A: Drug X B: Placebo C: Combination
+#> ——————————————————————————————————————————————————————————
+#> SCREENING
+#> mean 3.0 3.0 3.0
+#> BASELINE
+#> mean 2.5 2.8 2.5
+#> END OF INDUCTION
+#> mean 1.7 2.1 1.6
+#> FOLLOW UP
+#> mean 2.2 2.9 2.0
+rtables::build_table(lyt2, df = adrs)
+#> A: Drug X B: Placebo C: Combination
+#> ———————————————————————————————————————————————————————————————
+#> SCREENING
+#> n 154 178 144
+#> Mean (SD) (3.00, 0.00) (3.00, 0.00) (3.00, 0.00)
+#> Median 3.0 3.0 3.0
+#> Min - Max 3.0 - 3.0 3.0 - 3.0 3.0 - 3.0
+#> BASELINE
+#> n 136 146 124
+#> Mean (SD) (2.46, 0.88) (2.77, 1.00) (2.46, 1.08)
+#> Median 3.0 3.0 3.0
+#> Min - Max 1.0 - 4.0 1.0 - 5.0 1.0 - 5.0
+#> END OF INDUCTION
+#> n 218 205 217
+#> Mean (SD) (1.75, 0.90) (2.14, 1.28) (1.65, 1.06)
+#> Median 2.0 2.0 1.0
+#> Min - Max 1.0 - 4.0 1.0 - 5.0 1.0 - 5.0
+#> FOLLOW UP
+#> n 164 153 167
+#> Mean (SD) (2.23, 1.26) (2.89, 1.29) (1.97, 1.01)
+#> Median 2.0 4.0 2.0
+#> Min - Max 1.0 - 4.0 1.0 - 4.0 1.0 - 4.0
We see that tern
offers advanced analysis by extending
+rtables
function calls with only one additional function
+call.
More examples of the tabulation analyze functions are presented
+in the Tabulation vignette.
Clinical trial related plots complement the rich palette of
+tern tabulation analysis functions. Thus the tern package delivers a
+full-featured tool for clinical trial reporting. The tern plot functions
+return ggplot2 or gTree objects; the latter is returned when a table is
+attached to the plot.
+adsl <- formatters::ex_adsl
+adlb <- formatters::ex_adlb
+adlb <- dplyr::filter(adlb, PARAMCD == "ALT", AVISIT != "SCREENING")
The optional nestcolor package can be loaded to apply
+the standardized NEST color palette to all tern plots.
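For example (a sketch, assuming the optional nestcolor package is installed and using the adlb and adsl data prepared above), attaching the package before plotting is all that is needed:

```r
# Attaching nestcolor is enough for tern plots created afterwards
# to pick up the NEST color palette.
library(nestcolor)

tern::g_lineplot(adlb, adsl, subtitle = "Laboratory Test:")
```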
Line plot without a table generated by the
+tern::g_lineplot
function.
+# Mean with CI
+tern::g_lineplot(adlb, adsl, subtitle = "Laboratory Test:")
Line plot with a table generated by the tern::g_lineplot
+function.
+# Mean with CI, table and customized confidence level
+tern::g_lineplot(
+ adlb,
+ adsl,
+ table = c("n", "mean", "mean_ci"),
+ title = "Plot of Mean and 80% Confidence Limits by Visit"
+)
The first plot is a ggplot2 object and the second plot
+is a gTree object, as the latter contains the table. The second plot has
+to be properly resized for the table content to be clear and
+readable.
The tern
functions used for plot generation are mostly
+g_
prefixed. All tern
plot functions are
+listed on the
+tern website functions reference.
Most tern outputs can easily be accommodated into
+shiny apps. We recommend using tern outputs in
+teal apps. The teal package is a shiny-based interactive exploration
+framework for analyzing data. teal shiny apps with tern outputs are
+available in the teal.modules.clinical package.
In summary, tern contains many additional functions for
+creating tables, listings, and graphs used in clinical trials and other
+statistical analyses. The design of the package gives users a lot of
+flexibility to meet the analysis needs in a regulatory or exploratory
+reporting context.
For more information please explore the tern website.
+tern Formatting Functions Overview
+The tern R package provides functions to create common
+analyses from clinical trials in R, and these functions have default
+formatting arguments that display the output values in a specific
+way.
tern formatting differs from the formatting
+available in the formatters package, as tern formats are capable of
+handling logical statements, allowing for more fine-tuning of the output
+displayed. Depending on what type of value is being displayed, and what
+that value is, the format of the output will change. When using the
+formatters package, by contrast, the specified format is applied
+regardless of the value.
To see the formatting functions available in
+tern, see ?formatting_functions. To see the format strings available in
+formatters, see formatters::list_valid_format_labels().
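For example, the full set of format strings understood by formatters can be listed directly (the exact listing depends on the installed formatters version):

```r
# All format labels that can be passed to the .formats argument
# of tern layout creating functions:
formatters::list_valid_format_labels()
```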
tern & formatters Formats
+The packages used in this vignette are:
+The example below demonstrates the use of tern
+formatting in the count_abnormal() function. In this example, the “low”
+category has a non-zero numerator value, so both a fraction and a
+percentage value are displayed, while the “high” category has a
+numerator value of zero, so the fraction value is displayed without the
+redundant zero percentage value.
+df2 <- data.frame(
+ ID = as.character(c(1, 1, 2, 2)),
+ RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
+ BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
+ ONTRTFL = c("", "Y", "", "Y"),
+ stringsAsFactors = FALSE
+)
+
+df2 <- df2 %>%
+ filter(ONTRTFL == "Y")
+
+basic_table() %>%
+ count_abnormal(
+ var = "RANGE",
+ abnormal = list(low = "LOW", high = "HIGH"),
+ variables = list(id = "ID", baseline = "BL_RANGE"),
+ exclude_base_abn = FALSE,
+ .formats = list(fraction = format_fraction)
+ ) %>%
+ build_table(df2)
+#> all obs
+#> —————————————————
+#> low 2/2 (100%)
+#> high 0/2
In the following example the count_abnormal()
function
+is utilized again. This time both “low” values and “high” values have a
+non-zero numerator and so both show a percentage.
+df2 <- data.frame(
+ ID = as.character(c(1, 1, 2, 2)),
+ RANGE = factor(c("NORMAL", "LOW", "HIGH", "HIGH")),
+ BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
+ ONTRTFL = c("", "Y", "", "Y"),
+ stringsAsFactors = FALSE
+)
+
+df2 <- df2 %>%
+ filter(ONTRTFL == "Y")
+
+basic_table() %>%
+ count_abnormal(
+ var = "RANGE",
+ abnormal = list(low = "LOW", high = "HIGH"),
+ variables = list(id = "ID", baseline = "BL_RANGE"),
+ exclude_base_abn = FALSE,
+ .formats = list(fraction = format_fraction)
+ ) %>%
+ build_table(df2)
+#> all obs
+#> ————————————————
+#> low 1/2 (50%)
+#> high 1/2 (50%)
The following example demonstrates the difference when
+formatters
is used instead to format the output. Here we
+choose to use "xx / xx"
as our value format. The “high”
+value has a zero numerator value and the “low” value has a non-zero
+numerator, yet both are displayed in the same format.
+df2 <- data.frame(
+ ID = as.character(c(1, 1, 2, 2)),
+ RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
+ BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
+ ONTRTFL = c("", "Y", "", "Y"),
+ stringsAsFactors = FALSE
+)
+df2 <- df2 %>%
+ filter(ONTRTFL == "Y")
+
+basic_table() %>%
+ count_abnormal(
+ var = "RANGE",
+ abnormal = list(low = "LOW", high = "HIGH"),
+ variables = list(id = "ID", baseline = "BL_RANGE"),
+ exclude_base_abn = FALSE,
+ .formats = list(fraction = "xx / xx")
+ ) %>%
+ build_table(df2)
+#> all obs
+#> ——————————————
+#> low 2 / 2
+#> high 0 / 2
The same concept occurs when using any of the available formats from
+the formatters
package. The following example displays the
+same result using the "xx.x / xx.x"
format instead. Use
+formatters::list_valid_format_labels()
to see the full list
+of available formats in formatters
.
+df2 <- data.frame(
+ ID = as.character(c(1, 1, 2, 2)),
+ RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
+ BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
+ ONTRTFL = c("", "Y", "", "Y"),
+ stringsAsFactors = FALSE
+)
+df2 <- df2 %>%
+ filter(ONTRTFL == "Y")
+
+basic_table() %>%
+ count_abnormal(
+ var = "RANGE",
+ abnormal = list(low = "LOW", high = "HIGH"),
+ variables = list(id = "ID", baseline = "BL_RANGE"),
+ exclude_base_abn = FALSE,
+ .formats = list(fraction = "xx.x / xx.x")
+ ) %>%
+ build_table(df2)
+#> all obs
+#> ————————————————
+#> low 2.0 / 2.0
+#> high 0.0 / 2.0
Current tern formatting functions consider some of the
+following aspects when setting custom behaviors:
+How missing (NA) values are displayed.
+When the numerator is zero, tern fraction formatting
+functions will exclude the accompanying percentage value.
+functions will exclude the accompanying percentage value.Two functions that set a fixed number of decimal places (specifically
+1) are format_fraction_fixed_dp()
and
+format_count_fraction_fixed_dp()
. By default, formatting
+functions will remove trailing zeros, but these two functions will
+always have one decimal place in their percentage, even if the digit is
+a zero. See the following example:
+format_fraction_fixed_dp(x = c(num = 1L, denom = 3L))
+#> [1] "1/3 (33.3%)"
+format_fraction_fixed_dp(x = c(num = 1L, denom = 2L))
+#> [1] "1/2 (50.0%)"
+
+format_count_fraction_fixed_dp(x = c(2, 0.6667))
+#> [1] "2 (66.7%)"
+format_count_fraction_fixed_dp(x = c(2, 0.25))
+#> [1] "2 (25.0%)"
Functions that set custom values according to a certain threshold
+include format_extreme_values(), format_extreme_values_ci(), and
+format_fraction_threshold(). The extreme value formats work similarly,
+allowing the user to specify the maximum number of digits to include;
+very large or very small values are given a special string value. For
+example:
+extreme_format <- format_extreme_values(digits = 2)
+extreme_format(0.235)
+#> [1] "0.23"
+extreme_format(0.001)
+#> [1] "<0.01"
+extreme_format(Inf)
+#> [1] ">999.99"
The format_fraction_threshold()
function allows the user
+to specify a lower percentage threshold, below which values are instead
+assigned a special string value. For example:
+fraction_format <- format_fraction_threshold(0.05)
+fraction_format(x = c(20, 0.1))
+#> [1] 10
+fraction_format(x = c(2, 0.01))
+#> [1] "<5"
See the documentation on each function for specific details on their +behavior and how to customize them.
+If your table requires customized output that cannot be displayed
+using one of the pre-existing tern
formatting functions,
+you may want to consider creating a new formatting function. When
+creating your own formatting function it is important to consider the
+aspects listed in the Formatting Function Customization section
+above.
In this section we will create a custom formatting function derived
+from the format_fraction_fixed_dp()
function. First we will
+take a look at this function in detail and then we will customize
+it.
+# First we will see how the format_fraction_fixed_dp code works and displays the outputs
+format_fraction_fixed_dp <- function(x, ...) {
+ attr(x, "label") <- NULL
+ checkmate::assert_vector(x)
+ checkmate::assert_count(x["num"])
+ checkmate::assert_count(x["denom"])
+
+ result <- if (x["num"] == 0) {
+ paste0(x["num"], "/", x["denom"])
+ } else {
+ paste0(
+ x["num"], "/", x["denom"],
+ " (", sprintf("%.1f", round(x["num"] / x["denom"] * 100, 1)), "%)"
+ )
+ }
+ return(result)
+}
Here we see that if the numerator value is greater than 0, both the
+fraction and the percentage are displayed. If the numerator is 0, only
+the fraction is shown. Percent values always display 1 decimal place.
+Below we will create a dummy dataset and observe the output behavior
+when this formatting function is applied.
+
+df2 <- data.frame(
+ ID = as.character(c(1, 1, 2, 2)),
+ RANGE = factor(c("NORMAL", "LOW", "HIGH", "LOW")),
+ BL_RANGE = factor(c("NORMAL", "NORMAL", "HIGH", "HIGH")),
+ ONTRTFL = c("", "Y", "", "Y"),
+ stringsAsFactors = FALSE
+) %>%
+ filter(ONTRTFL == "Y")
+
+basic_table() %>%
+ count_abnormal(
+ var = "RANGE",
+ abnormal = list(low = "LOW", high = "HIGH"),
+ variables = list(id = "ID", baseline = "BL_RANGE"),
+ exclude_base_abn = FALSE,
+ .formats = list(fraction = format_fraction_fixed_dp)
+ ) %>%
+ build_table(df2)
+#> all obs
+#> ———————————————————
+#> low 2/2 (100.0%)
+#> high 0/2
Now we will modify this function to make our custom formatting
+function, custom_format
. We want to display 3 decimal
+places in the percent value, and if the numerator value is 0 we only
+want to display a 0 value (without the denominator).
+custom_format <- function(x, ...) {
+ attr(x, "label") <- NULL
+ checkmate::assert_vector(x)
+ checkmate::assert_count(x["num"])
+ checkmate::assert_count(x["denom"])
+
+ result <- if (x["num"] == 0) {
+ paste0(x["num"]) # We remove the denominator on this line so that only a 0 is displayed
+ } else {
+ paste0(
+ x["num"], "/", x["denom"],
+      " (", sprintf("%.3f", round(x["num"] / x["denom"] * 100, 3)), "%)" # We include 3 decimal places with %.3f
+ )
+ }
+ return(result)
+}
+
+basic_table() %>%
+ count_abnormal(
+ var = "RANGE",
+ abnormal = list(low = "LOW", high = "HIGH"),
+ variables = list(id = "ID", baseline = "BL_RANGE"),
+ exclude_base_abn = FALSE,
+ .formats = list(fraction = custom_format) # Here we implement our new custom_format function
+ ) %>%
+ build_table(df2)
+#> all obs
+#> —————————————————————
+#> low 2/2 (100.000%)
+#> high 0
Each tern analysis function has pre-specified default
+format functions to implement when generating output, some of which are
+taken from the formatters package and some of which are custom
+formatting functions stored in tern. These tern functions differ from
+those in formatters in that logical statements can be used to set
+value-dependent customized formats. If you would like to create your own
+custom formatting function to use with tern, be sure to carefully
+consider which rules you want to implement to handle different input
+values.