Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Wilcoxon Rank-Sum Tests #25

Merged
merged 14 commits into from
Feb 9, 2024
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,9 @@ export("%>%")
export(all_of)
export(any_of)
export(ard_paired_ttest)
export(ard_paired_wilcoxtest)
export(ard_ttest)
export(ard_wilcoxtest)
export(contains)
export(ends_with)
export(everything)
Expand Down
201 changes: 201 additions & 0 deletions R/ard_wilcoxtest.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
#' ARD Wilcoxon Rank-Sum Test
#'
#' @description
#' Analysis results data for paired and non-paired Wilcoxon Rank-Sum tests.
#'
#' @param data (`data.frame`)\cr
#' a data frame. See below for details.
#' @param by ([`tidy-select`][dplyr::dplyr_tidy_select])\cr
#' column name to compare by.
#' @param variable ([`tidy-select`][dplyr::dplyr_tidy_select])\cr
#' column name to be compared.
#' @param id ([`tidy-select`][dplyr::dplyr_tidy_select])\cr
#' column name of the subject or participant ID.
#' @param ... arguments passed to `wilcox.test(...)`
#'
#' @return ARD data frame
#' @name ard_wilcoxtest
#'
#' @details
#' For the `ard_wilcoxtest()` function, the data is expected to be one row per subject.
#' The data is passed as `wilcox.test(data[[variable]] ~ data[[by]], paired = FALSE, ...)`.
#'
#' For the `ard_paired_wilcoxtest()` function, the data is expected to be one row
#' per subject per by level. Before the t-test is calculated, the data are
#' reshaped to a wide format to be one row per subject.
#' The data are then passed as
#' `wilcox.test(x = data_wide[[<by level 1>]], y = data_wide[[<by level 2>]], paired = TRUE, ...)`.
Comment on lines +26 to +27
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not allow wrapping in docs, maybe it should be on the same line somehow?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the help files are rendered, it doesn't break the line mid-code. But if you make the window wide enough it does all appear on one line.

image

Am I addressing your comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering if it was intentional that the first breaks mid-code while the second does not

#'
#' @examples
#' cards::ADSL |>
#' dplyr::filter(ARM %in% c("Placebo", "Xanomeline High Dose")) |>
#' ard_wilcoxtest(by = "ARM", variable = "AGE")
#'
#' # constructing a paired data set,
#' # where patients receive both treatments
#' cards::ADSL[c("ARM", "AGE")] |>
#' dplyr::filter(ARM %in% c("Placebo", "Xanomeline High Dose")) |>
#' dplyr::mutate(.by = ARM, USUBJID = dplyr::row_number()) |>
#' dplyr::arrange(USUBJID, ARM) |>
#' ard_paired_wilcoxtest(by = ARM, variable = AGE, id = USUBJID)
NULL

#' @rdname ard_wilcoxtest
#' @export
ard_wilcoxtest <- function(data, by, variable, ...) {
# check installed packages ---------------------------------------------------
cards::check_pkg_installed("broom", reference_pkg = "cardx")

# check/process inputs -------------------------------------------------------
check_not_missing(data)
check_not_missing(variable)
check_not_missing(by)
check_class_data_frame(x = data)
data <- dplyr::ungroup(data)
cards::process_selectors(data, by = {{ by }}, variable = {{ variable }})
check_scalar(by)
check_scalar(variable)

# build ARD ------------------------------------------------------------------
.format_wilcoxtest_results(
by = by,
variable = variable,
lst_tidy =
cards::eval_capture_conditions(
stats::wilcox.test(data[[variable]] ~ data[[by]], paired = FALSE, ...) |>
broom::tidy()
),
paired = FALSE,
...
)
}

#' @rdname ard_wilcoxtest
#' @export
ard_paired_wilcoxtest <- function(data, by, variable, id, ...) {
# check installed packages ---------------------------------------------------
cards::check_pkg_installed("broom", reference_pkg = "cardx")

# check/process inputs -------------------------------------------------------
check_not_missing(data)
check_not_missing(variable)
check_not_missing(by)
check_not_missing(id)
check_class_data_frame(x = data)
data <- dplyr::ungroup(data)
cards::process_selectors(data, by = {{ by }}, variable = {{ variable }}, id = {{ id }})
check_scalar(by)
check_scalar(variable)
check_scalar(id)

# build ARD ------------------------------------------------------------------
.format_wilcoxtest_results(
by = by,
variable = variable,
lst_tidy =
cards::eval_capture_conditions({
# adding this reshape inside the eval, so if there is an error it's captured in the ARD object
data_wide <- .paired_data_pivot_wider(data, by = by, variable = variable, id = id)
# perform paried t-test
stats::wilcox.test(x = data_wide[["by1"]], y = data_wide[["by2"]], paired = TRUE, ...) |>
broom::tidy()
}),
paired = TRUE,
...
)
}

#' Convert t-test to ARD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just found this. It should be wilcox test @ddsjoberg

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh goodness, thank you!

#'
#' @inheritParams cards::tidy_as_ard
#' @inheritParams stats::wilcox.test
#' @param by (`string`)\cr by column name
#' @param variable (`string`)\cr variable column name
#' @param ... passed to `wilcox.test(...)`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing stats::

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to keep it like this, since it's essentially base R

#'
#' @return ARD data frame
#' @keywords internal
#' @examples
#' cardx:::.format_wilcoxtest_results(
#' by = "ARM",
#' variable = "AGE",
#' paired = FALSE,
#' lst_tidy =
#' cards::eval_capture_conditions(
#' stats::wilcox.test(ADSL[["AGE"]] ~ ADSL[["ARM"]], paired = FALSE) |>
#' broom::tidy()
#' )
#' )
.format_wilcoxtest_results <- function(by, variable, lst_tidy, paired, ...) {
# build ARD ------------------------------------------------------------------
ret <-
cards::tidy_as_ard(
lst_tidy = lst_tidy,
tidy_result_names = c("statistic", "p.value", "method", "alternative"),
fun_args_to_record = c(
"mu", "paired", "exact", "correct", "conf.int",
"conf.level", "tol.root", "digits.rank"
),
formals = formals(asNamespace("stats")[["wilcox.test.default"]]),
passed_args = c(list(paired = paired), dots_list(...)),
lst_ard_columns = list(group1 = by, variable = variable, context = "ttest")
)

# add the stat label ---------------------------------------------------------
ret |>
dplyr::left_join(
.df_wilcoxtest_stat_labels(),
by = "stat_name"
) |>
dplyr::mutate(stat_label = dplyr::coalesce(.data$stat_label, .data$stat_name)) |>
cards::tidy_ard_column_order()
}


#' Convert long paired data to wide
#'
#'
#' @param data (`data.frame`)\cr a data frame that is one line per subject per group
#' @param by (`string`)\cr by column name
#' @param variable (`string`)\cr variable column name
#' @param id (`string`)\cr subject id column name
#' @param env (`environment`) used for error messaging. Default is `rlang::caller_env()`
#'
#' @return a wide data frame
#' @keywords internal
#' @examples
#' cards::ADSL[c("ARM", "AGE")] |>
#' dplyr::filter(ARM %in% c("Placebo", "Xanomeline High Dose")) |>
#' dplyr::mutate(.by = ARM, USUBJID = dplyr::row_number()) |>
#' dplyr::arrange(USUBJID, ARM) |>
#' cardx:::.paired_data_pivot_wider(by = "ARM", variable = "AGE", id = "USUBJID")
.paired_data_pivot_wider <- function(data, by, variable, id, env = rlang::caller_env()) {
# check the number of levels before pivoting data to wider format
if (dplyr::n_distinct(data[[by]], na.rm = TRUE) != 2L) {
cli::cli_abort("The {.arg by} argument must have two and only two levels.", call = env)
}

data |>
# arrange data so the first group always appears first
dplyr::arrange(.data[[by]]) |>
tidyr::pivot_wider(
id_cols = all_of(id),
names_from = all_of(by),
values_from = all_of(variable)
) |>
stats::setNames(c(id, "by1", "by2"))
}

.df_wilcoxtest_stat_labels <- function() {
dplyr::tribble(
~stat_name, ~stat_label,
"statistic", "X-squared Statistic",
Melkiades marked this conversation as resolved.
Show resolved Hide resolved
"parameter", "Degrees of Freedom",
"estimate", "Median of the Difference",
"p.value", "p-value",
"conf.low", "CI Lower Bound",
"conf.high", "CI Upper Bound",
"paired", "Paired test",
"conf.level", "CI Confidence Level",
)
}
1 change: 1 addition & 0 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,3 +25,4 @@ reference:
- subtitle: "Inference"
- contents:
- ard_ttest
- ard_wilcoxtest
55 changes: 55 additions & 0 deletions man/ard_wilcoxtest.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

40 changes: 40 additions & 0 deletions man/dot-format_wilcoxtest_results.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

13 changes: 12 additions & 1 deletion man/dot-paired_data_pivot_wider.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading