ard_survival_survfit() requires injected formula #223

Closed
ddsjoberg opened this issue Oct 16, 2024 · 4 comments · Fixed by #226

@ddsjoberg
Collaborator

When a formula stored in a variable is passed to survfit(), the call records only the variable's name, not the formula itself. Later, when we try to parse the formula to get the stratifying variables, we get a cryptic error message (see below).

I think we could make the following updates.

  1. Improve the error message, and include an example of how to create a formula with the variable names injected (maybe with reformulate()). Or would it be possible to have a fallback method for identifying the variables?
  2. Perhaps write a data frame S3 method, ard_survival_survfit.data.frame(), where we inject it for them?

@edelarua what do you think?

formula <- survival::Surv(mpg, am) ~ cyl
data <- mtcars

x <- survival::survfit(formula, data = data)

x$call$formula
#> formula
stats::as.formula(x$call$formula)
#> Error in x$formula: object of type 'symbol' is not subsettable
cardx::ard_survival_survfit(x, times = 25)
#> Error in x$formula: object of type 'symbol' is not subsettable

Created on 2024-10-15 with reprex v2.1.1
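
For reference, one way to get the formula itself into the call is to inject it when the model is fit. This is a minimal sketch, assuming the rlang package is available. It illustrates the idea and is not necessarily the wording the documentation should use.

```r
formula <- survival::Surv(mpg, am) ~ cyl
data <- mtcars

# `!!formula` splices the formula object itself into the call,
# so x$call$formula holds the formula rather than the symbol `formula`
x <- rlang::inject(survival::survfit(!!formula, data = data))

# the term labels can now be recovered from the stored call
x_terms <- attr(stats::terms(stats::as.formula(x$call$formula)), "term.labels")
x_terms
#> [1] "cyl"
```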

@edelarua
Contributor

Hi @ddsjoberg,

If we replace the line of code here (where terms are extracted from the formula):

x_terms <- attr(stats::terms(stats::as.formula(x$call$formula)), "term.labels")

with an indirect extraction like this (or something similar):

  x_terms <- if ("strata" %in% names(df_stat)) {
    str_remove(str_split(as.character(df_stat$strata)[1], ", ")[[1]], "=.*")
  } else {
    NULL
  }

it works for all the cases I can think up, though it's not as clean.
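
To make the string-based extraction concrete, here is a small sketch of what it does to a strata label. The label values and the helper name parse_strata_vars() are hypothetical and used only for illustration; the stringr calls are the same ones as in the snippet above.

```r
# hypothetical helper wrapping the extraction shown above
parse_strata_vars <- function(label) {
  # split "var=level, var=level" on ", " and drop everything from "=" onward
  stringr::str_remove(stringr::str_split(label, ", ")[[1]], "=.*")
}

# hypothetical strata label with two stratifying variables
parse_strata_vars("sex=1, ph.ecog=0")
#> [1] "sex"     "ph.ecog"

# a level that itself contains ", " is treated as if it started a new variable
parse_strata_vars("grp=a, b")
#> [1] "grp" "b"
```

The second call shows the kind of edge case where parsing the label could misidentify the variables.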

What do you think?

@ddsjoberg
Collaborator Author

ddsjoberg commented Oct 21, 2024

I think I prefer terms() over parsing strings, so it may be best to:

  1. Improve the error message when the call's formula is a symbol/name instead of a proper formula.
  2. Add a section to the documentation about creating survfit objects with proper formulas in the call.
  3. Consider (not needed immediately) adding an ard_survival_survfit.data.frame() method that will construct the survfit object from inputs.
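
As a rough sketch of the first item, the check could look something like the following. The helper name check_survfit_formula() and the message wording are hypothetical, and base stop() is used here only to keep the example self-contained.

```r
# hypothetical helper; the name and message wording are illustrative only
check_survfit_formula <- function(x) {
  # a stored formula shows up in the call as a bare symbol, not a `~` call
  if (is.name(x$call$formula)) {
    stop(
      "The formula in the survfit() call is the symbol `",
      as.character(x$call$formula),
      "`, so its terms cannot be recovered. ",
      "Pass the formula directly, or inject it into the call.",
      call. = FALSE
    )
  }
  invisible(x)
}

formula <- survival::Surv(mpg, am) ~ cyl
x <- survival::survfit(formula, data = mtcars)

# capture the error message instead of stopping the script
msg <- tryCatch(check_survfit_formula(x), error = conditionMessage)

# a formula written directly in the call passes the check silently
check_survfit_formula(
  survival::survfit(survival::Surv(mpg, am) ~ cyl, data = mtcars)
)
```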

@edelarua
Contributor

Good idea!

@edelarua edelarua self-assigned this Oct 21, 2024
@edelarua edelarua added the sme label Oct 21, 2024
@ddsjoberg
Collaborator Author

@edelarua if it helps, here's an outline of what the data frame method could look like:

library(cardx)

ard_survival_survfit.data.frame <- function(x, y, variables, 
                                            times = NULL, probs = NULL, type = NULL, 
                                            survfit.args = list(conf.int = 0.95), ...) {
  # process outcome as string --------------------------------------------------
  y <- rlang::enquo(y)
  # if a character was passed, return it as is
  if (tryCatch(is.character(rlang::eval_tidy(y)), error = \(e) FALSE)) y <- rlang::eval_tidy(y) # styler: off
  # otherwise, convert expr to string
  else y <- rlang::expr_deparse(rlang::quo_get_expr(y))  # styler: off
  
  # build model ----------------------------------------------------------------
  construct_model(
    data = x,
    formula = stats::reformulate(termlabels = bt(variables), response = y),
    method = "survfit",
    package = "survival",
    method.args = {{ survfit.args }}
  ) |> 
    ard_survival_survfit(times = times, probs = probs, type = type)
}

ard_survival_survfit.data.frame(
  x = mtcars,
  y = "survival::Surv(mpg, am)",
  variables = "vs",
  times = 20,
  survfit.args = list(
    start.time = 0,
    id = cyl # testing NSE arg inputs
  )
)
#> {cards} data frame: 10 x 11
#>    group1 group1_level variable variable_level stat_name stat_label  stat
#> 1      vs            0     time             20    n.risk  Number o…     3
#> 2      vs            0     time             20  estimate  Survival… 0.615
#> 3      vs            0     time             20 std.error  Standard… 0.082
#> 4      vs            0     time             20 conf.high  CI Upper…   0.8
#> 5      vs            0     time             20  conf.low  CI Lower… 0.474
#> 6      vs            1     time             20    n.risk  Number o…    11
#> 7      vs            1     time             20  estimate  Survival…     1
#> 8      vs            1     time             20 std.error  Standard…     0
#> 9      vs            1     time             20 conf.high  CI Upper…     1
#> 10     vs            1     time             20  conf.low  CI Lower…     1
#> ℹ 4 more variables: context, fmt_fn, warning, error

ard_survival_survfit.data.frame(
  x = mtcars,
  y = survival::Surv(mpg, am), # testing unquoted outcome expr
  variables = "vs",
  times = 20,
  survfit.args = list(
    start.time = 0,
    id = cyl
  )
)
#> {cards} data frame: 10 x 11
#>    group1 group1_level variable variable_level stat_name stat_label  stat
#> 1      vs            0     time             20    n.risk  Number o…     3
#> 2      vs            0     time             20  estimate  Survival… 0.615
#> 3      vs            0     time             20 std.error  Standard… 0.082
#> 4      vs            0     time             20 conf.high  CI Upper…   0.8
#> 5      vs            0     time             20  conf.low  CI Lower… 0.474
#> 6      vs            1     time             20    n.risk  Number o…    11
#> 7      vs            1     time             20  estimate  Survival…     1
#> 8      vs            1     time             20 std.error  Standard…     0
#> 9      vs            1     time             20 conf.high  CI Upper…     1
#> 10     vs            1     time             20  conf.low  CI Lower…     1
#> ℹ 4 more variables: context, fmt_fn, warning, error

Created on 2024-10-21 with reprex v2.1.1
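
The formula construction in the outline above relies on stats::reformulate() accepting the outcome as a string. A minimal sketch of that step on its own, assuming a recent R version, looks like this. The bt() helper from the outline is left out because the variable name used here is a plain syntactic name.

```r
# build the formula from a string outcome and a character vector of terms
f <- stats::reformulate(
  termlabels = "vs",
  response = "survival::Surv(mpg, am)"
)

# the stratifying variable is recoverable with terms(), no string parsing needed
attr(stats::terms(f), "term.labels")
#> [1] "vs"
```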


2 participants