Improve analyze/summarize function documentation (#1283)
Fixes #1130
edelarua authored Sep 5, 2024
1 parent f9fb467 commit b2d1dbc
Showing 79 changed files with 1,102 additions and 658 deletions.
1 change: 1 addition & 0 deletions NEWS.md
@@ -19,6 +19,7 @@

### Miscellaneous
* Began deprecation of the confusing functions `summary_formats` and `summary_labels`.
* Enhanced general descriptions of analyze and summarize functions throughout package documentation.

# tern 0.9.5

33 changes: 21 additions & 12 deletions R/abnormal.R
@@ -1,13 +1,21 @@
#' Patient counts with abnormal range values
#' Count patients with abnormal range values
#'
#' @description `r lifecycle::badge("stable")`
#'
#' Primary analysis variable `.var` indicates the abnormal range result (`character` or `factor`)
#' and additional analysis variables are `id` (`character` or `factor`) and `baseline` (`character` or
#' `factor`). For each direction specified in `abnormal` (e.g. high or low) count patients in the
#' numerator and denominator as follows:
#' * `num` : The number of patients with this abnormality recorded while on treatment.
#' * `denom`: The number of patients with at least one post-baseline assessment.
#' The analyze function [count_abnormal()] creates a layout element to count patients with abnormal analysis range
#' values in each direction.
#'
#' This function analyzes primary analysis variable `var` which indicates abnormal range results.
#' Additional analysis variables that can be supplied as a list via the `variables` parameter are
#' `id` (defaults to `USUBJID`), a variable to indicate unique subject identifiers, and `baseline`
#' (defaults to `BNRIND`), a variable to indicate baseline reference ranges.
#'
#' For each direction specified via the `abnormal` parameter (e.g. High or Low), a fraction of
#' patient counts is returned, with numerator and denominator calculated as follows:
#' * `num`: The number of patients with this abnormality recorded while on treatment.
#' * `denom`: The total number of patients with at least one post-baseline assessment.
#'
#' This function assumes that `df` has been filtered to only include post-baseline records.
#'
#' @inheritParams argument_convention
#' @param abnormal (named `list`)\cr list identifying the abnormal range level(s) in `var`. Defaults to
@@ -19,11 +27,12 @@
#' to see available statistics for this function.
#'
#' @note
#' * `count_abnormal()` only works with a single variable containing multiple abnormal levels.
#' * `df` should be filtered to include only post-baseline records.
#' * the denominator includes patients that might have other abnormal levels at baseline,
#' and patients with missing baseline. Patients with these abnormalities at
#' baseline can be optionally excluded from numerator and denominator.
#' * `count_abnormal()` only considers a single variable that contains multiple abnormal levels.
#' * `df` should be filtered to only include post-baseline records.
#' * The denominator includes patients that may have other abnormal levels at baseline,
#' and patients missing baseline records. Patients with these abnormalities at
#' baseline can be optionally excluded from numerator and denominator via the
#' `exclude_base_abn` parameter.
#'
#' @name abnormal
#' @include formatting_functions.R
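The numerator and denominator rules documented above for `count_abnormal()` can be sketched in plain R. This is only an illustration of the counting rule, not the package's implementation: the toy data and the helper `count_abn()` are hypothetical, the column names simply follow the documented defaults, and the treatment of `exclude_base_abn` (dropping patients who already had the abnormality at baseline from both counts) is an assumption based on the note above.

```r
# Toy post-baseline lab records (hypothetical data; one row per assessment).
df <- data.frame(
  USUBJID = c("1", "1", "2", "2", "3", "4"),
  ANRIND  = c("NORMAL", "HIGH", "HIGH", "HIGH", "LOW", "NORMAL"),
  BNRIND  = c("NORMAL", "NORMAL", "HIGH", "HIGH", "NORMAL", "NORMAL"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count patients with abnormality `abn` (num) among
# patients with at least one post-baseline record (denom).
count_abn <- function(df, abn, exclude_base_abn = FALSE) {
  if (exclude_base_abn) {
    # Assumption: drop patients who already had this abnormality at baseline.
    df <- df[!df$BNRIND %in% abn, ]
  }
  c(
    num = length(unique(df$USUBJID[df$ANRIND %in% abn])),
    denom = length(unique(df$USUBJID))
  )
}

high <- count_abn(df, "HIGH")
high_excl <- count_abn(df, "HIGH", exclude_base_abn = TRUE)
low <- count_abn(df, "LOW")
```

In this toy data, patient 2 is already `HIGH` at baseline, so excluding baseline abnormalities removes that patient from both the numerator and the denominator for the `HIGH` direction.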
41 changes: 26 additions & 15 deletions R/abnormal_by_baseline.R
@@ -1,20 +1,31 @@
#' Patient counts with abnormal range values by baseline status
#' Count patients with abnormal analysis range values by baseline status
#'
#' @description `r lifecycle::badge("stable")`
#'
#' Primary analysis variable `.var` indicates the abnormal range result (`character` or `factor`), and additional
#' analysis variables are `id` (`character` or `factor`) and `baseline` (`character` or `factor`). For each
#' direction specified in `abnormal` (e.g. high or low) we condition on baseline range result and count
#' patients in the numerator and denominator as follows:
#' * `Not <Abnormal>`
#' * `denom`: the number of patients without abnormality at baseline (excluding those with missing baseline)
#' * `num`: the number of patients in `denom` who also have at least one abnormality post-baseline
#' * `<Abnormal>`
#' * `denom`: the number of patients with abnormality at baseline
#' * `num`: the number of patients in `denom` who also have at least one abnormality post-baseline
#' The analyze function [count_abnormal_by_baseline()] creates a layout element to count patients with abnormal
#' analysis range values, categorized by baseline status.
#'
#' This function analyzes primary analysis variable `var` which indicates abnormal range results. Additional
#' analysis variables that can be supplied as a list via the `variables` parameter are `id` (defaults to
#' `USUBJID`), a variable to indicate unique subject identifiers, and `baseline` (defaults to `BNRIND`), a
#' variable to indicate baseline reference ranges.
#'
#' For each direction specified via the `abnormal` parameter (e.g. High or Low), we condition on baseline
#' range result and count patients in the numerator and denominator as follows for each of the following
#' categories:
#' * `Not <abnormality>`
#' * `num`: The number of patients without abnormality at baseline (excluding those with missing baseline)
#' and with at least one abnormality post-baseline.
#' * `denom`: The number of patients without abnormality at baseline (excluding those with missing baseline).
#' * `<Abnormality>`
#'     * `num`: The number of patients with abnormality at baseline and at least one abnormality post-baseline.
#' * `denom`: The number of patients with abnormality at baseline.
#' * `Total`
#' * `denom`: the number of patients with at least one valid measurement post-baseline
#' * `num`: the number of patients in `denom` who also have at least one abnormality post-baseline
#' * `num`: The number of patients with at least one post-baseline record and at least one abnormality
#' post-baseline.
#' * `denom`: The number of patients with at least one post-baseline record.
#'
#' This function assumes that `df` has been filtered to only include post-baseline records.
#'
#' @inheritParams argument_convention
#' @param abnormal (`character`)\cr values identifying the abnormal range level(s) in `.var`.
@@ -23,8 +34,8 @@
#'
#' @note
#' * `df` should be filtered to include only post-baseline records.
#' * If the baseline variable or analysis variable contains `NA`, it is expected that `NA` has been
#' conveyed to `na_level` appropriately beforehand with [df_explicit_na()] or [explicit_na()].
#' * If the baseline variable or analysis variable contains `NA` records, it is expected that `df` has been
#' pre-processed using [df_explicit_na()] or [explicit_na()].
#'
#' @seealso Relevant description function [d_count_abnormal_by_baseline()].
#'
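The three baseline categories described above for `count_abnormal_by_baseline()` can be illustrated with a small plain-R sketch. The helpers `frac()` and `by_baseline()` and the toy data are hypothetical, and the string `"<Missing>"` is assumed to be the explicit level used for missing baseline values after pre-processing. The sketch shows the counting rule only and is not the package's code.

```r
# Toy post-baseline records (hypothetical); baseline NA already made explicit.
df <- data.frame(
  USUBJID = c("1", "1", "2", "3", "4", "5"),
  ANRIND  = c("NORMAL", "HIGH", "HIGH", "NORMAL", "HIGH", "NORMAL"),
  BNRIND  = c("NORMAL", "NORMAL", "HIGH", "HIGH", "<Missing>", "NORMAL"),
  stringsAsFactors = FALSE
)

# num: patients with the abnormality post-baseline; denom: all patients in `d`.
frac <- function(d, abn) {
  c(
    num = length(unique(d$USUBJID[d$ANRIND == abn])),
    denom = length(unique(d$USUBJID))
  )
}

# Condition on baseline status, then count within each subset.
by_baseline <- function(df, abn, na_str = "<Missing>") {
  list(
    not_abnormal = frac(df[!df$BNRIND %in% c(abn, na_str), ], abn),
    abnormal = frac(df[df$BNRIND == abn, ], abn),
    total = frac(df, abn)
  )
}

res <- by_baseline(df, "HIGH")
```

Patient 4 has a missing baseline, so that patient is left out of the `Not <abnormality>` category but still contributes to `Total`.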
30 changes: 22 additions & 8 deletions R/abnormal_by_marked.R
@@ -2,14 +2,28 @@
#'
#' @description `r lifecycle::badge("stable")`
#'
#' Primary analysis variable `.var` indicates whether single, replicated or last marked laboratory
#' abnormality was observed (`factor`). Additional analysis variables are `id` (`character` or `factor`)
#' and `direction` (`factor`) indicating the direction of the abnormality. Denominator is number of
#' patients with at least one valid measurement during the analysis.
#' * For `Single, not last` and `Last or replicated`: Numerator is number of patients
#' with `Single, not last` and `Last or replicated` levels, respectively.
#' * For `Any`: Numerator is the number of patients with either single or
#' replicated marked abnormalities.
#' The analyze function [count_abnormal_by_marked()] creates a layout element to count patients with marked laboratory
#' abnormalities for each direction of abnormality, categorized by parameter value.
#'
#' This function analyzes primary analysis variable `var` which indicates whether a single, replicated,
#' or last marked laboratory abnormality was observed. Levels of `var` to include for each marked lab
#' abnormality (`single` and `last_replicated`) can be supplied via the `category` parameter. Additional
#' analysis variables that can be supplied as a list via the `variables` parameter are `id` (defaults
#' to `USUBJID`), a variable to indicate unique subject identifiers, `param` (defaults to `PARAM`), a
#' variable to indicate parameter values, and `direction` (defaults to `abn_dir`), a variable to indicate
#' abnormality directions.
#'
#' For each combination of `param` and `direction` levels, marked lab abnormality counts are calculated
#' as follows:
#' * `Single, not last` & `Last or replicated`: The number of patients with `Single, not last`
#' and `Last or replicated` values, respectively.
#' * `Any`: The number of patients with either single or replicated marked abnormalities.
#'
#' Fractions are calculated by dividing the above counts by the number of patients with at least one
#' valid measurement recorded during the analysis.
#'
#' Prior to using this function in your table layout you must use [rtables::split_rows_by()] to create two
#' row splits, one on variable `param` and one on variable `direction`.
#'
#' @inheritParams argument_convention
#' @param category (`list`)\cr a list with different marked category names for single
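As a rough illustration of the marked-abnormality counts described in this hunk, the sketch below computes the `Single, not last`, `Last or replicated`, and `Any` numerators and their fractions for a single `param` and `direction` combination. The data, the `AVALCAT1` column name, the level strings, and the helper `n_pat()` are all hypothetical; in particular, the `category` values here are chosen for the example and are not claimed to be the function's defaults.

```r
# Toy data for one parameter and one direction (hypothetical; one row per patient).
df <- data.frame(
  USUBJID  = c("1", "2", "3", "4", "5"),
  AVALCAT1 = c("Single, not last", "Last or replicated", "",
               "Last or replicated", ""),
  stringsAsFactors = FALSE
)

# Levels of the analysis variable that define each marked category (assumed values).
category <- list(
  single = "Single, not last",
  last_replicated = "Last or replicated"
)

n_pat <- function(ids) length(unique(ids))

# Denominator: patients with at least one valid measurement.
denom <- n_pat(df$USUBJID)

num <- c(
  single = n_pat(df$USUBJID[df$AVALCAT1 %in% category$single]),
  last_replicated = n_pat(df$USUBJID[df$AVALCAT1 %in% category$last_replicated]),
  any = n_pat(df$USUBJID[df$AVALCAT1 %in% unlist(category)])
)

fractions <- num / denom
```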
29 changes: 20 additions & 9 deletions R/abnormal_by_worst_grade.R
@@ -1,20 +1,31 @@
#' Patient counts with the most extreme post-baseline toxicity grade per direction of abnormality
#' Count patients by most extreme post-baseline toxicity grade per direction of abnormality
#'
#' @description `r lifecycle::badge("stable")`
#'
#' Primary analysis variable `.var` indicates the toxicity grade (`factor`), and additional
#' analysis variables are `id` (`character` or `factor`), `param` (`factor`) and `grade_dir` (`factor`).
#' The pre-processing steps are crucial when using this function.
#' For a certain direction (e.g. high or low) this function counts
#' patients in the denominator as number of patients with at least one valid measurement during treatment,
#' and patients in the numerator as follows:
#' * `1` to `4`: Numerator is number of patients with worst grades 1-4 respectively;
#' * `Any`: Numerator is number of patients with at least one abnormality, which means grade is different from 0.
#' The analyze function [count_abnormal_by_worst_grade()] creates a layout element to count patients by highest (worst)
#' analysis toxicity grade post-baseline for each direction, categorized by parameter value.
#'
#' This function analyzes primary analysis variable `var` which indicates toxicity grades. Additional
#' analysis variables that can be supplied as a list via the `variables` parameter are `id` (defaults to
#' `USUBJID`), a variable to indicate unique subject identifiers, `param` (defaults to `PARAM`), a variable
#' to indicate parameter values, and `grade_dir` (defaults to `GRADE_DIR`), a variable to indicate directions
#' (e.g. High or Low) for each toxicity grade supplied in `var`.
#'
#' For each combination of `param` and `grade_dir` levels, patient counts by worst
#' grade are calculated as follows:
#' * `1` to `4`: The number of patients with worst grades 1-4, respectively.
#' * `Any`: The number of patients with at least one abnormality (i.e. grade is not 0).
#'
#' Fractions are calculated by dividing the above counts by the number of patients with at least one
#' valid measurement recorded during treatment.
#'
#' Pre-processing is crucial when using this function and can be done automatically using the
#' [h_adlb_abnormal_by_worst_grade()] helper function. See the description of this function for details on the
#' necessary pre-processing steps.
#'
#' Prior to using this function in your table layout you must use [rtables::split_rows_by()] to create two row
#' splits, one on variable `param` and one on variable `grade_dir`.
#'
#' @inheritParams argument_convention
#' @param .stats (`character`)\cr statistics to select for the table. Run `get_stats("abnormal_by_worst_grade")`
#' to see available statistics for this function.
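The worst-grade counts described above can be sketched in plain R for one `param` and `grade_dir` combination. The sketch assumes the data have already been pre-processed to one row per patient holding that patient's worst post-baseline grade; the data and the `GRADE_ANL` column name are illustrative only, and this is not the package's implementation.

```r
# Toy data: worst post-baseline grade per patient (hypothetical).
df <- data.frame(
  USUBJID = as.character(1:6),
  GRADE_ANL = factor(c("0", "1", "1", "3", "0", "4"), levels = as.character(0:4))
)

# Denominator: patients with at least one valid measurement during treatment.
denom <- length(unique(df$USUBJID))

# Numerators for grades 1 to 4, then "Any" (grade is not 0).
grades <- as.character(1:4)
num <- vapply(grades, function(g) sum(df$GRADE_ANL == g), integer(1))
num <- c(num, Any = sum(df$GRADE_ANL != "0"))

fractions <- num / denom
```

Because each patient contributes exactly one worst grade, the `Any` count equals the sum of the grade 1 to 4 counts.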
26 changes: 23 additions & 3 deletions R/abnormal_by_worst_grade_worsen.R
@@ -1,8 +1,27 @@
#' Patient counts for laboratory events (worsen from baseline) by highest grade post-baseline
#' Count patients with toxicity grades that have worsened from baseline by highest grade post-baseline
#'
#' @description `r lifecycle::badge("stable")`
#'
#' Patient count and fraction for laboratory events (worsen from baseline) shift table.
#' The analyze function [count_abnormal_lab_worsen_by_baseline()] creates a layout element to count patients with
#' analysis toxicity grades which have worsened from baseline, categorized by highest (worst) grade post-baseline.
#'
#' This function analyzes primary analysis variable `var` which indicates analysis toxicity grades. Additional
#' analysis variables that can be supplied as a list via the `variables` parameter are `id` (defaults to `USUBJID`),
#' a variable to indicate unique subject identifiers, `baseline_var` (defaults to `BTOXGR`), a variable to indicate
#' baseline toxicity grades, and `direction_var` (defaults to `GRADDIR`), a variable to indicate toxicity grade
#' directions of interest to include (e.g. `"H"` (high), `"L"` (low), or `"B"` (both)).
#'
#' For the direction(s) specified in `direction_var`, patient counts by worst grade for patients who have
#' worsened from baseline are calculated as follows:
#' * `1` to `4`: The number of patients who have worsened from their baseline grades with worst
#' grades 1-4, respectively.
#' * `Any`: The total number of patients who have worsened from their baseline grades.
#'
#' Fractions are calculated by dividing the above counts by the number of patients whose analysis toxicity grades
#' have worsened from baseline toxicity grades during treatment.
#'
#' Prior to using this function in your table layout you must use [rtables::split_rows_by()] to create a row
#' split on variable `direction_var`.
#'
#' @inheritParams argument_convention
#' @param variables (named `list` of `string`)\cr list of additional analysis variables including:
@@ -12,7 +31,8 @@
#' @param .stats (`character`)\cr statistics to select for the table. Run `get_stats("abnormal_by_worst_grade_worsen")`
#' to see all available statistics.
#'
#' @seealso Relevant helper functions [h_adlb_worsen()] and [h_worsen_counter()]
#' @seealso Relevant helper functions [h_adlb_worsen()] and [h_worsen_counter()] which are used within
#' [s_count_abnormal_lab_worsen_by_baseline()] to process input data.
#'
#' @name abnormal_by_worst_grade_worsen
#' @order 1
2 changes: 1 addition & 1 deletion R/analyze_colvars_functions.R
@@ -1,4 +1,4 @@
#' Analyze functions on columns
#' Analyze functions in columns
#'
#' @description
#' These functions are wrappers of [rtables::analyze_colvars()] which apply corresponding `tern`
4 changes: 2 additions & 2 deletions R/analyze_functions.R
@@ -6,6 +6,7 @@
#' to add an analysis to a given table layout:
#'
#' * [analyze_num_patients()]
#' * [analyze_vars()]
#' * [compare_vars()]
#' * [count_abnormal()]
#' * [count_abnormal_by_baseline()]
@@ -32,14 +33,13 @@
#' leverage `analyze_colvars` to have the context split in rows and the analysis
#' methods in columns.
#' * [summarize_change()]
#' * [analyze_vars()]
#' * [surv_time()]
#' * [surv_timepoint()]
#' * [test_proportion_diff()]
#'
#' @seealso
#' * [summarize_functions] for functions which are wrappers for [rtables::summarize_row_groups()].
#' * [analyze_colvars_functions] for functions that are wrappers for [rtables::analyze_colvars()].
#' * [summarize_functions] for functions which are wrappers for [rtables::summarize_row_groups()].
#'
#' @name analyze_functions
NULL
10 changes: 5 additions & 5 deletions R/analyze_variables.R
@@ -32,11 +32,11 @@ control_analyze_vars <- function(conf_level = 0.95,
#'
#' @description `r lifecycle::badge("stable")`
#'
#' The analyze function [analyze_vars()] generates a summary of one or more variables, using the S3 generic function
#' [s_summary()] to calculate a list of summary statistics. A list of all available statistics for numeric
#' variables can be viewed by running `get_stats("analyze_vars_numeric")` and for non-numeric variables by running
#' `get_stats("analyze_vars_counts")`. Use the `.stats` parameter to specify the statistics to include in your output
#' summary table.
#' The analyze function [analyze_vars()] creates a layout element to summarize one or more variables, using the S3
#' generic function [s_summary()] to calculate a list of summary statistics. A list of all available statistics for
#' numeric variables can be viewed by running `get_stats("analyze_vars_numeric")` and for non-numeric variables by
#' running `get_stats("analyze_vars_counts")`. Use the `.stats` parameter to specify the statistics to include in your
#' output summary table.
#'
#' @details
#' **Automatic digit formatting:** The number of digits to display can be automatically determined from the analyzed
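The description in this hunk mentions an S3 generic that returns a list of summary statistics, from which `.stats` selects a subset. The general pattern can be sketched as follows. The generic `summary_stats()` and the helper `select_stats()` are hypothetical stand-ins written for this illustration; they are not `s_summary()` and do not reproduce its statistics or names.

```r
# Hypothetical S3 generic: dispatches on the class of the analyzed variable.
summary_stats <- function(x, ...) UseMethod("summary_stats")

# Method for numeric variables: a named list of statistics.
summary_stats.numeric <- function(x, ...) {
  list(
    n = sum(!is.na(x)),
    mean = mean(x, na.rm = TRUE),
    sd = stats::sd(x, na.rm = TRUE)
  )
}

# Method for factors: counts per level instead of moments.
summary_stats.factor <- function(x, ...) {
  list(
    n = sum(!is.na(x)),
    count = vapply(levels(x), function(l) sum(x == l, na.rm = TRUE), integer(1))
  )
}

# Keep only the requested statistics, mirroring the role of a `.stats` argument.
select_stats <- function(x, .stats) summary_stats(x)[.stats]

age <- c(34, 51, 47, NA)
sex <- factor(c("F", "M", "F", "F"))

num_out <- select_stats(age, c("n", "mean"))
cat_out <- select_stats(sex, "count")
```

Dispatching on the variable's class is what lets a single entry point offer different statistics for numeric and non-numeric variables.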
27 changes: 13 additions & 14 deletions R/analyze_vars_in_cols.R
@@ -1,10 +1,12 @@
#' Summarize numeric variables in columns
#' Analyze numeric variables in columns
#'
#' @description `r lifecycle::badge("experimental")`
#'
#' Layout-creating function which can be used for creating column-wise summary tables.
#' This function sets the analysis methods as column labels and is a wrapper for
#' [rtables::analyze_colvars()]. It was designed principally for PK tables.
#' The layout-creating function [analyze_vars_in_cols()] creates a layout element to generate a column-wise
#' analysis table.
#'
#' This function sets the analysis methods as column labels and is a wrapper for [rtables::analyze_colvars()].
#' It was designed principally for PK tables.
#'
#' @inheritParams argument_convention
#' @inheritParams rtables::analyze_colvars
@@ -35,16 +37,13 @@
#' Adding this function to an `rtable` layout will summarize the given variables, arrange the output
#' in columns, and add it to the table layout.
#'
#' @note This is an experimental implementation of [rtables::summarize_row_groups()] and
#' [rtables::analyze_colvars()] that may be subjected to changes as `rtables` extends its
#' support to more complex analysis pipelines on the column space. For the same reasons,
#' we encourage to read the examples carefully and file issues for cases that differ from
#' them.
#'
#' Here `labelstr` behaves differently than usual. If it is not defined (default as `NULL`),
#' row labels are assigned automatically to the split values in case of `rtables::analyze_colvars`
#' (`do_summarize_row_groups = FALSE`, the default), and to the group label for
#' `do_summarize_row_groups = TRUE`.
#' @note
#' * This is an experimental implementation of [rtables::summarize_row_groups()] and [rtables::analyze_colvars()]
#'   that may be subject to change as `rtables` extends its support to more complex analysis pipelines in the
#' column space. We encourage users to read the examples carefully and file issues for different use cases.
#' * In this function, `labelstr` behaves atypically. If `labelstr = NULL` (the default), row labels are assigned
#' automatically as the split values if `do_summarize_row_groups = FALSE` (the default), and as the group label
#' if `do_summarize_row_groups = TRUE`.
#'
#' @seealso [analyze_vars()], [rtables::analyze_colvars()].
#'