Skip to content

Commit

Permalink
Updating ard_proportion_ci() to accept categorical variables (#158)
Browse files Browse the repository at this point in the history
**What changes are proposed in this pull request?**
* The `ard_proportion_ci(value)` argument has been added. Previously,
only binary variables (0/1 or TRUE/FALSE) could be summarized. Now,
binary variables will have the value specified in `value` summarized.
For non-binary, categorical variables, each level is summarized.

``` r
cardx::ard_proportion_ci(
  mtcars, 
  variables = cyl
) |> 
  dplyr::filter(stat_name %in% c("N", "estimate", "conf.low", "conf.high"))
#> {cards} data frame: 12 x 9
#>    variable variable_level   context stat_name stat_label  stat
#> 1       cyl              4 proporti…         N          N    32
#> 2       cyl              4 proporti…  estimate   estimate 0.344
#> 3       cyl              4 proporti…  conf.low   conf.low 0.164
#> 4       cyl              4 proporti… conf.high  conf.high 0.524
#> 5       cyl              6 proporti…         N          N    32
#> 6       cyl              6 proporti…  estimate   estimate 0.219
#> 7       cyl              6 proporti…  conf.low   conf.low  0.06
#> 8       cyl              6 proporti… conf.high  conf.high 0.378
#> 9       cyl              8 proporti…         N          N    32
#> 10      cyl              8 proporti…  estimate   estimate 0.438
#> 11      cyl              8 proporti…  conf.low   conf.low  0.25
#> 12      cyl              8 proporti… conf.high  conf.high 0.625
#> ℹ 3 more variables: fmt_fn, warning, error
```

<sup>Created on 2024-05-25 with [reprex
v2.1.0](https://reprex.tidyverse.org)</sup>

closes #154


--------------------------------------------------------------------------------

Pre-review Checklist (if item does not apply, mark is as complete)
- [x] **All** GitHub Action workflows pass with a ✅
- [x] PR branch has pulled the most recent updates from master branch:
`usethis::pr_merge_main()`
- [x] If a bug was fixed, a unit test was added.
- [x] If a new `ard_*()` function was added, it passes the ARD
structural checks from `cards::check_ard_structure()`.
- [x] If a new `ard_*()` function was added, `set_cli_abort_call()` has
been set.
- [x] If a new `ard_*()` function was added and it depends on another
package (such as, `broom`), `is_pkg_installed("broom", reference_pkg =
"cardx")` has been set in the function call and the following added to
the roxygen comments: `@examplesIf
do.call(asNamespace("cardx")$is_pkg_installed, list(pkg = "broom"",
reference_pkg = "cardx"))`
- [x] Code coverage is suitable for any new functions/features
(generally, 100% coverage for new code): `devtools::test_coverage()`

Reviewer Checklist (if item does not apply, mark is as complete)

- [ ] If a bug was fixed, a unit test was added.
- [ ] Code coverage is suitable for any new functions/features:
`devtools::test_coverage()`

When the branch is ready to be merged:
- [ ] Update `NEWS.md` with the changes from this pull request under the
heading "`# cardx (development version)`". If there is an issue
associated with the pull request, reference it in parentheses at the end
update (see `NEWS.md` for examples).
- [ ] **All** GitHub Action workflows pass with a ✅
- [ ] Approve Pull Request
- [ ] Merge the PR. Please use "Squash and merge" or "Rebase and merge".

---------

Signed-off-by: Daniel Sjoberg <[email protected]>
Co-authored-by: Zelos Zhu <[email protected]>
  • Loading branch information
ddsjoberg and zdz2101 authored May 29, 2024
1 parent ce47f50 commit 9a50ea9
Show file tree
Hide file tree
Showing 18 changed files with 404 additions and 39 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ BugReports: https://github.com/insightsengineering/cardx/issues
Depends:
R (>= 4.1)
Imports:
cards (>= 0.1.0.9026),
cards (>= 0.1.0.9032),
cli (>= 3.6.1),
dplyr (>= 1.1.2),
glue (>= 1.6.2),
Expand Down
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@ export(construct_model)
export(contains)
export(ends_with)
export(everything)
export(is_binary)
export(last_col)
export(matches)
export(num_range)
Expand Down
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ ard_moodtest() -> ard_stats_mood_test()

### New Features

* The `ard_proportion_ci(value)` argument has been added. Previously, only binary variables (0/1 or TRUE/FALSE) could be summarized. When a value is not supplied, each level of the variable is summarized independently. By default, binary variables will have the 1/TRUE level summarized.

* Added the following functions for calculating Analysis Results Data (ARD).
- `ard_stats_aov()` for calculating ANOVA results using `stats::aov()`. (#3)
- `ard_stats_anova()` for calculating ANOVA results using `stats::anova()`. (#12)
Expand Down
91 changes: 85 additions & 6 deletions R/ard_proportion_ci.R
Original file line number Diff line number Diff line change
Expand Up @@ -18,23 +18,35 @@
#' See `?proportion_ci` for details.
#' @param strata,weights,max.iterations arguments passed to `proportion_ci_strat_wilson()`,
#' when `method='strat_wilson'`
#' @param value ([`formula-list-selector`][syntax])\cr
#' function will calculate the CIs for all levels of the variables specified.
#' Use this argument to instead request only a single level by summarized.
#' Default is `list(where(is_binary) ~ 1L, where(is.logical) ~ TRUE)`, where
#' columns coded as `0`/`1` and `TRUE`/`FALSE` will summarize the `1` and `TRUE` levels.
#'
#' @return an ARD data frame
#' @export
#'
#' @examplesIf do.call(asNamespace("cardx")$is_pkg_installed, list(pkg = "broom", reference_pkg = "cardx"))
#' # compute CI for binary variables
#' ard_proportion_ci(mtcars, variables = c(vs, am), method = "wilson")
ard_proportion_ci <- function(data, variables, by = dplyr::group_vars(data),
conf.level = 0.95,
strata,
weights = NULL,
max.iterations = 10,
#'
#' # compute CIs for each level of a categorical variable
#' ard_proportion_ci(mtcars, variables = cyl, method = "jeffreys")
ard_proportion_ci <- function(data,
variables,
by = dplyr::group_vars(data),
method = c(
"waldcc", "wald", "clopper-pearson",
"wilson", "wilsoncc",
"strat_wilson", "strat_wilsoncc",
"agresti-coull", "jeffreys"
)) {
),
conf.level = 0.95,
value = list(where(is_binary) ~ 1L, where(is.logical) ~ TRUE),
strata = NULL,
weights = NULL,
max.iterations = 10) {
set_cli_abort_call()

# check installed packages ---------------------------------------------------
Expand All @@ -47,8 +59,43 @@ ard_proportion_ci <- function(data, variables, by = dplyr::group_vars(data),
cards::process_selectors(data, strata = strata)
check_scalar(strata)
}
cards::process_formula_selectors(
data[variables],
value = value
)

# calculate confidence intervals ---------------------------------------------
map(
variables,
function(variable) {
levels <- .unique_values_sort(data, variable = variable, value = value[[variable]])

.calculate_ard_proportion(
data = .as_dummy(data, variable = variable, levels = levels, by = by, strata = strata),
variables = c(everything(), -all_of(c(by, strata))),
by = all_of(by),
method = method,
conf.level = conf.level,
strata = strata,
weights = weights,
max.iterations = max.iterations
) %>%
# merge in the variable levels
dplyr::left_join(
dplyr::select(., "variable") |>
dplyr::distinct() |>
dplyr::mutate(variable_level = as.list(.env$levels)),
by = "variable"
) |>
# rename variable column
dplyr::mutate(variable = .env$variable) |>
dplyr::relocate("variable_level", .after = "variable")
}
) |>
dplyr::bind_rows()
}

.calculate_ard_proportion <- function(data, variables, by, method, conf.level, strata, weights, max.iterations) {
cards::ard_complex(
data = data,
variables = {{ variables }},
Expand Down Expand Up @@ -85,3 +132,35 @@ ard_proportion_ci <- function(data, variables, by = dplyr::group_vars(data),
context = "proportion_ci"
)
}

.unique_values_sort <- function(data, variable, value = NULL) {
unique_levels <-
# styler: off
if (is.logical(data[[variable]])) c(TRUE, FALSE)
else if (is.factor(data[[variable]])) factor(levels(data[[variable]]), levels = levels(data[[variable]]))
else unique(data[[variable]]) |> sort()
# styler: on

if (!is_empty(value) && !value %in% unique_levels) {
cli::cli_warn(
c("A value of {.code value={.val {value}}} for variable {.val {variable}}
was passed, but is not one of the observed levels: {.val {unique_levels}}.",
i = "This may be an error.",
i = "If value is a valid, convert variable to factor with all levels specified to avoid this message."
)
)
}
if (!is_empty(value)) {
unique_levels <- value
}

unique_levels
}

.as_dummy <- function(data, variable, levels, by, strata) {
# define dummy variables and return tibble
map(levels, ~ data[[variable]] == .x) |>
set_names(paste0("this_is_not_a_column_name_anyone_would_choose_", variable, "_", levels, "...")) %>%
{dplyr::tibble(!!!.)} |> # styler: off
dplyr::bind_cols(data[c(by, strata)])
}
7 changes: 7 additions & 0 deletions R/proportion_ci.R
Original file line number Diff line number Diff line change
Expand Up @@ -362,6 +362,13 @@ proportion_ci_strat_wilson <- function(x,
compact()
}

#' @describeIn proportion_ci Helper to determine if vector is binary (logical or 0/1)
#'
#' @export
is_binary <- function(x) {
is.logical(x) || (is_integerish(x) && is_empty(setdiff(x, c(0, 1, NA))))
}

#' Helper Function for the Estimation of Stratified Quantiles
#'
#' This function wraps the estimation of stratified percentiles when we assume
Expand Down
29 changes: 20 additions & 9 deletions man/ard_proportion_ci.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

21 changes: 21 additions & 0 deletions man/figures/lifecycle-archived.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
21 changes: 21 additions & 0 deletions man/figures/lifecycle-defunct.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
21 changes: 21 additions & 0 deletions man/figures/lifecycle-deprecated.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
21 changes: 21 additions & 0 deletions man/figures/lifecycle-experimental.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
21 changes: 21 additions & 0 deletions man/figures/lifecycle-maturing.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
21 changes: 21 additions & 0 deletions man/figures/lifecycle-questioning.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
21 changes: 21 additions & 0 deletions man/figures/lifecycle-soft-deprecated.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit 9a50ea9

Please sign in to comment.