Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1324 feature request varying decimal precision in a summary #1356

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

iaugusty
Copy link
Collaborator

Pull Request

This is a draft proposal on how to use customized formats with varying decimal precision.
Any feedback is more than welcome.

For now, only analyze_vars is using this approach, whereas I'd like to at least handle ancova analysis results in a similar way.

@iaugusty iaugusty linked an issue Nov 19, 2024 that may be closed by this pull request
3 tasks
Copy link
Contributor

github-actions bot commented Nov 19, 2024

Unit Tests Summary

    1 files     84 suites   1m 39s ⏱️
  870 tests   867 ✅  3 💤 0 ❌
2 322 runs  2 294 ✅ 28 💤 0 ❌

Results for commit 648d5b6.

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Nov 19, 2024

Unit Test Performance Difference

Test Suite $Status$ Time on main $±Time$ $±Tests$ $±Skipped$ $±Failures$ $±Errors$
analyze_vars_in_cols 💔 $2.40$ $+3.48$ $+17$ $-7$ $0$ $0$
count_occurrences 💔 $0.74$ $+1.66$ $+10$ $-8$ $0$ $0$
count_occurrences_by_grade 💔 $1.76$ $+1.11$ $+16$ $-17$ $0$ $0$
summarize_coxreg 💔 $3.81$ $+1.55$ $+13$ $-13$ $0$ $0$
summarize_num_patients 💔 $1.11$ $+1.28$ $+18$ $-16$ $0$ $0$
Additional test case details
Test Suite $Status$ Time on main $±Time$ Test Case
analyze_vars_in_cols 💔 $0.48$ $+1.57$ summarize_works_with_nested_analyze

Results for commit 9ebc3bc

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Nov 19, 2024

badge

Code Coverage Summary

Filename                                      Stmts    Miss  Cover    Missing
------------------------------------------  -------  ------  -------  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
R/abnormal_by_baseline.R                         65       0  100.00%
R/abnormal_by_marked.R                           55       5  90.91%   93-97
R/abnormal_by_worst_grade_worsen.R              116       3  97.41%   263-265
R/abnormal_by_worst_grade.R                      60       0  100.00%
R/abnormal.R                                     43       0  100.00%
R/analyze_variables.R                           189      11  94.18%   544, 561-576, 765
R/analyze_vars_in_cols.R                        176      13  92.61%   178, 221, 235-236, 244-252
R/bland_altman.R                                 92       1  98.91%   46
R/combination_function.R                          9       0  100.00%
R/compare_variables.R                            84       2  97.62%   257, 316
R/control_incidence_rate.R                       10       0  100.00%
R/control_logistic.R                              7       0  100.00%
R/control_step.R                                 23       1  95.65%   58
R/control_survival.R                             15       0  100.00%
R/count_cumulative.R                             59       1  98.31%   74
R/count_missed_doses.R                           36       0  100.00%
R/count_occurrences_by_grade.R                  157       2  98.73%   177, 271
R/count_occurrences.R                           116       1  99.14%   120
R/count_patients_events_in_cols.R                67       1  98.51%   60
R/count_patients_with_event.R                    62       1  98.39%   123
R/count_patients_with_flags.R                    95       1  98.95%   134
R/count_values.R                                 27       0  100.00%
R/cox_regression_inter.R                        154       0  100.00%
R/cox_regression.R                              161       0  100.00%
R/coxph.R                                       167       7  95.81%   191-195, 238, 253, 261, 267-268
R/d_pkparam.R                                   406       0  100.00%
R/decorate_grob.R                               113       0  100.00%
R/desctools_binom_diff.R                        621      64  89.69%   53, 88-89, 125-126, 129, 199, 223-232, 264, 266, 286, 290, 294, 298, 353, 356, 359, 362, 422, 430, 439, 444-447, 454, 457, 466, 469, 516-517, 519-520, 522-523, 525-526, 593, 604-616, 620, 663, 676, 680
R/df_explicit_na.R                               30       0  100.00%
R/estimate_multinomial_rsp.R                     50       1  98.00%   65
R/estimate_proportion.R                         205      11  94.63%   83-90, 94, 99, 320, 486
R/fit_rsp_step.R                                 36       0  100.00%
R/fit_survival_step.R                            36       0  100.00%
R/formatting_functions.R                        227      46  79.74%   141, 269-325, 360
R/g_forest.R                                    585      60  89.74%   240, 252-255, 260-261, 275, 277, 287-290, 335-338, 345, 414, 501, 514, 518-519, 524-525, 538, 554, 601, 632, 707, 716, 722, 741, 796-816, 819, 830, 849, 904, 907, 1042-1047
R/g_ipp.R                                       133       0  100.00%
R/g_km.R                                        350      57  83.71%   285-288, 307-309, 363-366, 400, 428, 432-475, 482-486
R/g_lineplot.R                                  260      22  91.54%   204, 378-385, 424-434, 543, 551
R/g_step.R                                       68       1  98.53%   108
R/g_waterfall.R                                  47       0  100.00%
R/h_adsl_adlb_merge_using_worst_flag.R           73       0  100.00%
R/h_biomarkers_subgroups.R                       46       0  100.00%
R/h_cox_regression.R                            110       0  100.00%
R/h_incidence_rate.R                             45       0  100.00%
R/h_km.R                                        509      41  91.94%   137, 189-194, 287, 378, 380-381, 392-394, 413, 420-421, 423-425, 433-435, 460, 465-468, 651-654, 1108-1119
R/h_logistic_regression.R                       468       3  99.36%   203-204, 273
R/h_map_for_count_abnormal.R                     54       0  100.00%
R/h_pkparam_sort.R                               15       0  100.00%
R/h_response_biomarkers_subgroups.R              90      12  86.67%   50-55, 107-112
R/h_response_subgroups.R                        178      18  89.89%   257-270, 329-334
R/h_stack_by_baskets.R                           64       1  98.44%   89
R/h_step.R                                      180       0  100.00%
R/h_survival_biomarkers_subgroups.R              88       6  93.18%   111-116
R/h_survival_duration_subgroups.R               207      18  91.30%   259-271, 336-341
R/imputation_rule.R                              17       0  100.00%
R/incidence_rate.R                               86       7  91.86%   67-72, 152
R/logistic_regression.R                         102       0  100.00%
R/missing_data.R                                 21       3  85.71%   32, 66, 76
R/odds_ratio.R                                  117       0  100.00%
R/prop_diff_test.R                               91       0  100.00%
R/prop_diff.R                                   265      15  94.34%   70-73, 105, 290-297, 440, 605
R/prune_occurrences.R                            57       0  100.00%
R/response_biomarkers_subgroups.R                69       6  91.30%   196-201
R/response_subgroups.R                          213       8  96.24%   100-105, 260-261
R/riskdiff.R                                     65       5  92.31%   102-105, 114
R/rtables_access.R                               38       0  100.00%
R/score_occurrences.R                            20       1  95.00%   124
R/split_cols_by_groups.R                         49       0  100.00%
R/stat.R                                         59       0  100.00%
R/summarize_ancova.R                            106       2  98.11%   183, 188
R/summarize_change.R                             74       1  98.65%   184
R/summarize_colvars.R                            10       0  100.00%
R/summarize_coxreg.R                            172       0  100.00%
R/summarize_glm_count.R                         209       3  98.56%   193-194, 490
R/summarize_num_patients.R                       93       4  95.70%   117-119, 266
R/summarize_patients_exposure_in_cols.R          96       1  98.96%   56
R/survival_biomarkers_subgroups.R                78       6  92.31%   117-122
R/survival_coxph_pairwise.R                      84      12  85.71%   51-52, 64-73
R/survival_duration_subgroups.R                 211       6  97.16%   124-129
R/survival_time.R                               111       0  100.00%
R/survival_timepoint.R                          124      10  91.94%   131-140
R/utils_checkmate.R                              68       0  100.00%
R/utils_default_stats_formats_labels.R          156       0  100.00%
R/utils_factor.R                                109       2  98.17%   84, 302
R/utils_ggplot.R                                110       0  100.00%
R/utils_grid.R                                  126       5  96.03%   164, 279-286
R/utils_rtables.R                               124       9  92.74%   39, 46, 403-404, 526-530
R/utils_split_funs.R                             52       2  96.15%   82, 94
R/utils.R                                       141       7  95.04%   118, 121, 124, 128, 137-138, 332
R/xutils_custom_stats_formats_varying_dp.R      102     102  0.00%    140-395
TOTAL                                         10964     628  94.27%

Diff against main

Filename                                      Stmts    Miss  Cover
------------------------------------------  -------  ------  --------
R/analyze_variables.R                           +14      +9  -4.68%
R/formatting_functions.R                        +44     +44  -19.17%
R/utils_default_stats_formats_labels.R           -1       0  +100.00%
R/xutils_custom_stats_formats_varying_dp.R     +102    +102  +100.00%
TOTAL                                          +159    +155  -1.35%

Results for commit: 648d5b6

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

#'
#' @details
#' Currently only available for usage within `a_summary`, `analyze_vars`, and hope to extend to at least `a_ancova` and `summarize_ancova`.
#' Question to Roche: is there an intention to refactor a_ancova and summarize_ancova to not use make_afun, but same approach of in_rows as in a_summary?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think questions should not be in the documentation

Copy link
Contributor

@Melkiades Melkiades Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

open an issue/ask us via email

#' ) %>%
#' build_table(dt2)
#'
get_formats_from_stats_custom <- function(stats,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a new function? _custom should be reserved for local code only

if (any(grepl("xx.d", str, fixed = TRUE))){
checkmate::assert_integerish(d)
str <-
gsub("xx.d", paste0("xx.", paste0(rep("x", times = d), collapse = "")), str, fixed = TRUE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is not better using sprintf directly?

@Melkiades
Copy link
Contributor

@iaugusty which issue is this PR addressing?

@Melkiades Melkiades added the sme label Nov 20, 2024
@Melkiades Melkiades marked this pull request as draft November 20, 2024 08:41
@iaugusty
Copy link
Collaborator Author

@iaugusty which issue is this PR addressing?

issue #1324:

  • alternative approach to control decimal precision, different as currently available auto approach
  • option to include formatting function (where we can control the rounding method - SAS vs R rounding)
  • option to have approach for defining other format defaults for the various stats than the tern defaults, eg mean (sd), we want to have xx.x (xx.xx) rather than xx.xx (xx.xx)

#' @param stats (`character`)\cr statistical methods to get defaults for.
#'
#' @details
#' Currently only available for usage within `a_summary`, `analyze_vars`, and hope to extend to at least `a_ancova` and `summarize_ancova`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix all lint error please. Thanks!

Comment on lines +89 to +95
#' get_formats_from_stats_custom(
#' stats = c("mean_pval", "mean", "sd" ),
#' formats_in = c("mean" = "xx.xxxx"),
#' fmts_specs = list(
#' fmts_df = tern_formats_custom_df(),
#' fmts_df_var = "default"
#' ))$fmt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all formatting is wrong here

if (!("formatting_function_exclude" %in% ls)) {formatting_function_exclude <- NULL}
if (!("formatting_function" %in% ls)) {formatting_function <- NULL}
if ("fmts_df_var" %in% ls && fmts_df_var != "default" &&!("fmts_df" %in% ls)) {
stop("fmts_df should be added to fmts_specs")}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please run a styler

extra_args <- list(.stats = .stats, na.rm = na.rm, na_str = na_str, ...)
.indent_mods = NULL,
# varying precision arguments
fmt_specs = default_fmt_specs) {
Copy link
Contributor

@Melkiades Melkiades Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything should pass by .formats. Having this is in addition is confusing. We can do already everything from .formats at the moment. Hence, I do not think we need this. It could be helpful to take a look again at the automatic formatting that we provide here in {tern}

fmts_specs = fmt_specs
)
.formats <- .formats_all$fmt
.formats_char <- .formats_all$fmt_char
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.formats_char is not used?

#' our_custom_fmts <- tern_formats_custom_df()
#'
#'
tern_formats_custom_df <- function(){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I understood everything correctly, that's the function that you aim to overwrite to provide your custom data.

If the scope is really big (which I am still not quite sure about) - I would propose a configuration object. It would be the best to make it an environment class of the object so that the <- assign operator would work. It should have some meaningful defaults and allow for overwrittes. (Currently it's not very clear to me how the overwrites works in the current approach).
My idea is as follows:

(with defaults)

x <- (...) |> tern::foo(...) |> tern::bar(...)

(if wants to change the defaults)

tern::default_config$formats$foo <- ...
tern::default_config$formats$bar <- ...

x <- (...) |> tern::foo(...) |> tern::bar(...)

This approach would allow for more customisation in the future.
Please note that this is quite fundamental change to the package structure that would probably require more follow-up changes throughout the package to make it consistent as a whole (if it's working in one place - why this is not working in another). Let's do this only if the scope is really big.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: varying decimal precision in a_summary
4 participants