1324 feature request varying decimal precision in a summary #1356

iaugusty · 2024-11-19T10:43:13Z

Pull Request

This is a draft proposal on how to use customized formats with varying decimal precision.
Any feedback is more than welcome.

For now, only analyze_vars is using this approach, whereas I'd like to at least handle ancova analysis results in a similar way.

…on-in-a_summary

github-actions · 2024-11-19T10:57:08Z

Unit Tests Summary

1 files 84 suites 1m 39s ⏱️
870 tests 867 ✅ 3 💤 0 ❌
2 322 runs 2 294 ✅ 28 💤 0 ❌

Results for commit 648d5b6.

♻️ This comment has been updated with latest results.

github-actions · 2024-11-19T10:57:13Z

Unit Test Performance Difference

Test Suite	$Status$	Time on `main`	$±Time$	$±Tests$	$±Skipped$	$±Failures$	$±Errors$
analyze_vars_in_cols	💔	$2.40$	$+3.48$	$+17$	$-7$	$0$	$0$
count_occurrences	💔	$0.74$	$+1.66$	$+10$	$-8$	$0$	$0$
count_occurrences_by_grade	💔	$1.76$	$+1.11$	$+16$	$-17$	$0$	$0$
summarize_coxreg	💔	$3.81$	$+1.55$	$+13$	$-13$	$0$	$0$
summarize_num_patients	💔	$1.11$	$+1.28$	$+18$	$-16$	$0$	$0$

Additional test case details

Test Suite	$Status$	Time on `main`	$±Time$	Test Case
analyze_vars_in_cols	💔	$0.48$	$+1.57$	summarize_works_with_nested_analyze

Results for commit 9ebc3bc

♻️ This comment has been updated with latest results.

github-actions · 2024-11-19T11:00:12Z

Code Coverage Summary

Filename                                      Stmts    Miss  Cover    Missing
------------------------------------------  -------  ------  -------  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
R/abnormal_by_baseline.R                         65       0  100.00%
R/abnormal_by_marked.R                           55       5  90.91%   93-97
R/abnormal_by_worst_grade_worsen.R              116       3  97.41%   263-265
R/abnormal_by_worst_grade.R                      60       0  100.00%
R/abnormal.R                                     43       0  100.00%
R/analyze_variables.R                           189      11  94.18%   544, 561-576, 765
R/analyze_vars_in_cols.R                        176      13  92.61%   178, 221, 235-236, 244-252
R/bland_altman.R                                 92       1  98.91%   46
R/combination_function.R                          9       0  100.00%
R/compare_variables.R                            84       2  97.62%   257, 316
R/control_incidence_rate.R                       10       0  100.00%
R/control_logistic.R                              7       0  100.00%
R/control_step.R                                 23       1  95.65%   58
R/control_survival.R                             15       0  100.00%
R/count_cumulative.R                             59       1  98.31%   74
R/count_missed_doses.R                           36       0  100.00%
R/count_occurrences_by_grade.R                  157       2  98.73%   177, 271
R/count_occurrences.R                           116       1  99.14%   120
R/count_patients_events_in_cols.R                67       1  98.51%   60
R/count_patients_with_event.R                    62       1  98.39%   123
R/count_patients_with_flags.R                    95       1  98.95%   134
R/count_values.R                                 27       0  100.00%
R/cox_regression_inter.R                        154       0  100.00%
R/cox_regression.R                              161       0  100.00%
R/coxph.R                                       167       7  95.81%   191-195, 238, 253, 261, 267-268
R/d_pkparam.R                                   406       0  100.00%
R/decorate_grob.R                               113       0  100.00%
R/desctools_binom_diff.R                        621      64  89.69%   53, 88-89, 125-126, 129, 199, 223-232, 264, 266, 286, 290, 294, 298, 353, 356, 359, 362, 422, 430, 439, 444-447, 454, 457, 466, 469, 516-517, 519-520, 522-523, 525-526, 593, 604-616, 620, 663, 676, 680
R/df_explicit_na.R                               30       0  100.00%
R/estimate_multinomial_rsp.R                     50       1  98.00%   65
R/estimate_proportion.R                         205      11  94.63%   83-90, 94, 99, 320, 486
R/fit_rsp_step.R                                 36       0  100.00%
R/fit_survival_step.R                            36       0  100.00%
R/formatting_functions.R                        227      46  79.74%   141, 269-325, 360
R/g_forest.R                                    585      60  89.74%   240, 252-255, 260-261, 275, 277, 287-290, 335-338, 345, 414, 501, 514, 518-519, 524-525, 538, 554, 601, 632, 707, 716, 722, 741, 796-816, 819, 830, 849, 904, 907, 1042-1047
R/g_ipp.R                                       133       0  100.00%
R/g_km.R                                        350      57  83.71%   285-288, 307-309, 363-366, 400, 428, 432-475, 482-486
R/g_lineplot.R                                  260      22  91.54%   204, 378-385, 424-434, 543, 551
R/g_step.R                                       68       1  98.53%   108
R/g_waterfall.R                                  47       0  100.00%
R/h_adsl_adlb_merge_using_worst_flag.R           73       0  100.00%
R/h_biomarkers_subgroups.R                       46       0  100.00%
R/h_cox_regression.R                            110       0  100.00%
R/h_incidence_rate.R                             45       0  100.00%
R/h_km.R                                        509      41  91.94%   137, 189-194, 287, 378, 380-381, 392-394, 413, 420-421, 423-425, 433-435, 460, 465-468, 651-654, 1108-1119
R/h_logistic_regression.R                       468       3  99.36%   203-204, 273
R/h_map_for_count_abnormal.R                     54       0  100.00%
R/h_pkparam_sort.R                               15       0  100.00%
R/h_response_biomarkers_subgroups.R              90      12  86.67%   50-55, 107-112
R/h_response_subgroups.R                        178      18  89.89%   257-270, 329-334
R/h_stack_by_baskets.R                           64       1  98.44%   89
R/h_step.R                                      180       0  100.00%
R/h_survival_biomarkers_subgroups.R              88       6  93.18%   111-116
R/h_survival_duration_subgroups.R               207      18  91.30%   259-271, 336-341
R/imputation_rule.R                              17       0  100.00%
R/incidence_rate.R                               86       7  91.86%   67-72, 152
R/logistic_regression.R                         102       0  100.00%
R/missing_data.R                                 21       3  85.71%   32, 66, 76
R/odds_ratio.R                                  117       0  100.00%
R/prop_diff_test.R                               91       0  100.00%
R/prop_diff.R                                   265      15  94.34%   70-73, 105, 290-297, 440, 605
R/prune_occurrences.R                            57       0  100.00%
R/response_biomarkers_subgroups.R                69       6  91.30%   196-201
R/response_subgroups.R                          213       8  96.24%   100-105, 260-261
R/riskdiff.R                                     65       5  92.31%   102-105, 114
R/rtables_access.R                               38       0  100.00%
R/score_occurrences.R                            20       1  95.00%   124
R/split_cols_by_groups.R                         49       0  100.00%
R/stat.R                                         59       0  100.00%
R/summarize_ancova.R                            106       2  98.11%   183, 188
R/summarize_change.R                             74       1  98.65%   184
R/summarize_colvars.R                            10       0  100.00%
R/summarize_coxreg.R                            172       0  100.00%
R/summarize_glm_count.R                         209       3  98.56%   193-194, 490
R/summarize_num_patients.R                       93       4  95.70%   117-119, 266
R/summarize_patients_exposure_in_cols.R          96       1  98.96%   56
R/survival_biomarkers_subgroups.R                78       6  92.31%   117-122
R/survival_coxph_pairwise.R                      84      12  85.71%   51-52, 64-73
R/survival_duration_subgroups.R                 211       6  97.16%   124-129
R/survival_time.R                               111       0  100.00%
R/survival_timepoint.R                          124      10  91.94%   131-140
R/utils_checkmate.R                              68       0  100.00%
R/utils_default_stats_formats_labels.R          156       0  100.00%
R/utils_factor.R                                109       2  98.17%   84, 302
R/utils_ggplot.R                                110       0  100.00%
R/utils_grid.R                                  126       5  96.03%   164, 279-286
R/utils_rtables.R                               124       9  92.74%   39, 46, 403-404, 526-530
R/utils_split_funs.R                             52       2  96.15%   82, 94
R/utils.R                                       141       7  95.04%   118, 121, 124, 128, 137-138, 332
R/xutils_custom_stats_formats_varying_dp.R      102     102  0.00%    140-395
TOTAL                                         10964     628  94.27%

Diff against main

Filename                                      Stmts    Miss  Cover
------------------------------------------  -------  ------  --------
R/analyze_variables.R                           +14      +9  -4.68%
R/formatting_functions.R                        +44     +44  -19.17%
R/utils_default_stats_formats_labels.R           -1       0  +100.00%
R/xutils_custom_stats_formats_varying_dp.R     +102    +102  +100.00%
TOTAL                                          +159    +155  -1.35%

Results for commit: 648d5b6

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

Melkiades · 2024-11-19T15:57:25Z

R/xutils_custom_stats_formats_varying_dp.R

+#'
+#' @details
+#' Currently only available for usage within `a_summary`, `analyze_vars`, and hope to extend to at least `a_ancova` and `summarize_ancova`.
+#' Question to Roche: is there an intention to refactor a_ancova and summarize_ancova to not use make_afun, but same approach of in_rows as in a_summary?


I think questions should not be in the documentation

open an issue/ask us via email

Melkiades · 2024-11-19T15:59:25Z

R/xutils_custom_stats_formats_varying_dp.R

+#'   ) %>%
+#'   build_table(dt2)
+#'
+get_formats_from_stats_custom <- function(stats,


why a new function? _custom should be reserved for local code only

Melkiades · 2024-11-19T16:00:33Z

R/xutils_custom_stats_formats_varying_dp.R

+  if (any(grepl("xx.d", str, fixed = TRUE))){
+    checkmate::assert_integerish(d)
+    str <-
+      gsub("xx.d", paste0("xx.", paste0(rep("x", times = d), collapse = "")), str, fixed = TRUE)


is not better using sprintf directly?

Melkiades · 2024-11-19T16:01:33Z

@iaugusty which issue is this PR addressing?

…on-in-a_summary

iaugusty · 2024-11-21T13:05:13Z

@iaugusty which issue is this PR addressing?

issue #1324:

alternative approach to control decimal precision, different as currently available auto approach
option to include formatting function (where we can control the rounding method - SAS vs R rounding)
option to have approach for defining other format defaults for the various stats than the tern defaults, eg mean (sd), we want to have xx.x (xx.xx) rather than xx.xx (xx.xx)

Melkiades · 2024-11-21T15:30:39Z

R/xutils_custom_stats_formats_varying_dp.R

+#' @param stats (`character`)\cr statistical methods to get defaults for.
+#'
+#' @details
+#' Currently only available for usage within `a_summary`, `analyze_vars`, and hope to extend to at least `a_ancova` and `summarize_ancova`.


Fix all lint error please. Thanks!

Melkiades · 2024-11-21T15:31:56Z

R/xutils_custom_stats_formats_varying_dp.R

+#' get_formats_from_stats_custom(
+#' stats = c("mean_pval", "mean", "sd" ),
+#' formats_in = c("mean" = "xx.xxxx"),
+#' fmts_specs = list(
+#' fmts_df = tern_formats_custom_df(),
+#' fmts_df_var = "default"
+#' ))$fmt


all formatting is wrong here

Melkiades · 2024-11-21T15:33:02Z

R/xutils_custom_stats_formats_varying_dp.R

+  if (!("formatting_function_exclude" %in% ls)) {formatting_function_exclude <- NULL}
+  if (!("formatting_function" %in% ls)) {formatting_function <- NULL}
+  if ("fmts_df_var" %in% ls && fmts_df_var != "default" &&!("fmts_df" %in% ls)) {
+    stop("fmts_df should be added to fmts_specs")}


please run a styler

Melkiades · 2024-11-21T15:36:51Z

R/analyze_variables.R

-  extra_args <- list(.stats = .stats, na.rm = na.rm, na_str = na_str, ...)
+                         .indent_mods = NULL,
+                         # varying precision arguments
+                         fmt_specs = default_fmt_specs) {


Everything should pass by .formats. Having this is in addition is confusing. We can do already everything from .formats at the moment. Hence, I do not think we need this. It could be helpful to take a look again at the automatic formatting that we provide here in {tern}

…on-in-a_summary

Melkiades · 2024-12-12T14:46:50Z

R/analyze_variables.R

+      fmts_specs = fmt_specs
+    )
+    .formats <- .formats_all$fmt
+    .formats_char <- .formats_all$fmt_char


.formats_char is not used?

pawelru · 2024-12-13T09:19:06Z

R/xutils_custom_stats_formats_varying_dp.R

+#' our_custom_fmts <- tern_formats_custom_df()
+#'
+#'
+tern_formats_custom_df <- function(){


if I understood everything correctly, that's the function that you aim to overwrite to provide your custom data.

If the scope is really big (which I am still not quite sure about) - I would propose a configuration object. It would be the best to make it an environment class of the object so that the <- assign operator would work. It should have some meaningful defaults and allow for overwrittes. (Currently it's not very clear to me how the overwrites works in the current approach).
My idea is as follows:

(with defaults)

x <- (...) |> tern::foo(...) |> tern::bar(...)

(if wants to change the defaults)

tern::default_config$formats$foo <- ... tern::default_config$formats$bar <- ... x <- (...) |> tern::foo(...) |> tern::bar(...)

This approach would allow for more customisation in the future.
Please note that this is quite fundamental change to the package structure that would probably require more follow-up changes throughout the package to make it consistent as a whole (if it's working in one place - why this is not working in another). Let's do this only if the scope is really big.

iaugusty added 7 commits October 14, 2024 12:06

draft proposal

3080d20

_ci_3d vars not defined in this branch

3f10c4f

add new formatting function, similar to format_xx

440ecf4

update example

9261ef4

add documentation files

bb7883c

Merge branch 'main' into 1324-feature-request-varying-decimal-precisi…

4bcaa2f

…on-in-a_summary

update after merge from main

a1994fa

iaugusty requested review from Melkiades and edelarua November 19, 2024 10:43

iaugusty linked an issue Nov 19, 2024 that may be closed by this pull request

[Feature Request]: varying decimal precision in a_summary #1324

Open

3 tasks

github-actions bot added 2 commits November 19, 2024 10:45

[skip style] [skip vbump] Restyle files

a8bd470

[skip roxygen] [skip vbump] Roxygen Man Pages Auto Update

52056e4

Melkiades reviewed Nov 19, 2024

View reviewed changes

Melkiades added the sme label Nov 20, 2024

Merge branch 'main' into 1324-feature-request-varying-decimal-precisi…

648d5b6

…on-in-a_summary

Melkiades marked this pull request as draft November 20, 2024 08:41

Melkiades reviewed Nov 21, 2024

View reviewed changes

Merge branch 'main' into 1324-feature-request-varying-decimal-precisi…

84c581c

…on-in-a_summary

Melkiades reviewed Dec 12, 2024

View reviewed changes

pawelru reviewed Dec 13, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

1324 feature request varying decimal precision in a summary #1356

1324 feature request varying decimal precision in a summary #1356

iaugusty commented Nov 19, 2024

github-actions bot commented Nov 19, 2024 •

edited

Loading

github-actions bot commented Nov 19, 2024 •

edited

Loading

github-actions bot commented Nov 19, 2024 •

edited

Loading

Melkiades Nov 19, 2024

Melkiades Nov 21, 2024 •

edited

Loading

Melkiades Nov 19, 2024

Melkiades Nov 19, 2024

Melkiades commented Nov 19, 2024

iaugusty commented Nov 21, 2024

Melkiades Nov 21, 2024

Melkiades Nov 21, 2024

Melkiades Nov 21, 2024

Melkiades Nov 21, 2024 •

edited

Loading

Melkiades Dec 12, 2024

pawelru Dec 13, 2024

1324 feature request varying decimal precision in a summary #1356

Are you sure you want to change the base?

1324 feature request varying decimal precision in a summary #1356

Conversation

iaugusty commented Nov 19, 2024

Pull Request

github-actions bot commented Nov 19, 2024 • edited Loading

Unit Tests Summary

github-actions bot commented Nov 19, 2024 • edited Loading

Unit Test Performance Difference

github-actions bot commented Nov 19, 2024 • edited Loading

Code Coverage Summary

Diff against main

Choose a reason for hiding this comment

Melkiades Nov 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Melkiades commented Nov 19, 2024

iaugusty commented Nov 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Melkiades Nov 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

github-actions bot commented Nov 19, 2024 •

edited

Loading

github-actions bot commented Nov 19, 2024 •

edited

Loading

github-actions bot commented Nov 19, 2024 •

edited

Loading

Melkiades Nov 21, 2024 •

edited

Loading

Melkiades Nov 21, 2024 •

edited

Loading