Issue when sorting table using sort_at_path() given cont_n_onecol() #868
Comments
what if you try |
That is what I assumed and tried, but it did not work. It seems that cont_n_onecol() works at the child levels under RACE? |
I put together some example code:
library(rtables)
library(dplyr)
raw_lyt <- basic_table() %>%
split_cols_by("ARM") %>%
split_rows_by("SEX") %>%
summarize_row_groups() %>%
split_rows_by("RACE") %>%
summarize_row_groups() %>%
split_rows_by("STRATA1") %>%
summarize_row_groups() %>%
analyze("AGE")
DM_black <- DM[DM$RACE == "BLACK OR AFRICAN AMERICAN", ]
DM <- rbind(DM, DM_black, DM_black, DM_black, DM_black)
raw_tbl <- build_table(raw_lyt, DM)
coltrimmed <- raw_tbl[, col_counts(raw_tbl) > 0]
pruned <- prune_table(coltrimmed)
pruned
sort_at_path(pruned, path = c("SEX", "*", "RACE", "*", "STRATA1"), cont_n_onecol(3))
I got:
A: Drug X B: Placebo C: Combination
——————————————————————————————————————————————————————————————————————————
F 2302 (64.1%) 1544 (50.1%) 1673 (48.1%)
ASIAN 44 (1.2%) 37 (1.2%) 40 (1.2%)
A 15 (0.4%) 14 (0.5%) 15 (0.4%)
Mean 30.40 35.43 37.40
C 13 (0.4%) 10 (0.3%) 15 (0.4%)
Mean 36.92 34.00 33.47
B 16 (0.4%) 13 (0.4%) 10 (0.3%)
Mean 33.75 32.46 33.30
BLACK OR AFRICAN AMERICAN 2250 (62.6%) 1500 (48.7%) 1625 (46.7%)
B 875 (24.4%) 375 (12.2%) 750 (21.6%)
Mean 36.14 29.67 36.33
A 625 (17.4%) 625 (20.3%) 500 (14.4%)
Mean 31.20 28.00 30.75
C 750 (20.9%) 500 (16.2%) 375 (10.8%)
Mean 31.33 34.50 33.00
WHITE 8 (0.2%) 7 (0.2%) 8 (0.2%)
C 2 (0.1%) 3 (0.1%) 4 (0.1%)
Mean 35.50 44.67 38.50
B 4 (0.1%) 1 (0.0%) 3 (0.1%)
Mean 37.00 48.00 34.33
A 2 (0.1%) 3 (0.1%) 1 (0.0%)
Mean 34.00 29.33 35.00
M 1291 (35.9%) 1538 (49.9%) 1804 (51.9%)
ASIAN 35 (1.0%) 31 (1.0%) 44 (1.3%)
A 12 (0.3%) 6 (0.2%) 16 (0.5%)
Mean 34.42 30.33 36.25
C 15 (0.4%) 9 (0.3%) 16 (0.5%)
Mean 35.60 31.89 31.38
B 8 (0.2%) 16 (0.5%) 12 (0.3%)
Mean 34.88 30.94 35.92
BLACK OR AFRICAN AMERICAN 1250 (34.8%) 1500 (48.7%) 1750 (50.3%)
B 375 (10.4%) 375 (12.2%) 750 (21.6%)
Mean 34.33 32.00 31.00
A 125 (3.5%) 250 (8.1%) 500 (14.4%)
Mean 33.00 30.00 36.50
C 750 (20.9%) 875 (28.4%) 500 (14.4%)
Mean 39.67 34.00 36.50
WHITE 6 (0.2%) 7 (0.2%) 10 (0.3%)
A 1 (0.0%) 3 (0.1%) 5 (0.1%)
Mean 45.00 33.33 32.80
C 2 (0.1%) 0 (0.0%) 4 (0.1%)
Mean 44.00 NA 35.00
B 3 (0.1%) 4 (0.1%) 1 (0.0%)
Mean 43.67 36.75 36.00
Yet I expected the row
M 1291 (35.9%) 1538 (49.9%) 1804 (51.9%)
to move up to the first row together with its children, rather than:
F 2302 (64.1%) 1544 (50.1%) 1673 (48.1%) |
You are asking for sorting at two different levels of the table, so you need to do it twice:
sort_at_path(pruned, path = c("SEX", "*", "RACE", "*", "STRATA1"), cont_n_onecol(3)) %>%
  sort_at_path(path = c("SEX"), cont_n_onecol(3))
|
Thank you @Melkiades. I found that my issue was related to the "empty row" added in split_rows_by() in my table. I solved it by following the previously closed issue #315, as suggested by @gmbecker, using "split_rows_by(..., section_div = " ")". |
@Melkiades may I ask: if several rows at one level share the same count in cont_n_onecol(1), how can I sort by cont_n_onecol(1) only and then alphabetically, without looking at cont_n_onecol(2)? For example:
Ophthalmologicals xx (xx.x) xx (xx.x)
......
......
Local anesthetics 3 (1.8) 3 (2.2)
Local anesthetics 3 (1.8) 3 (2.2)
Antiinflammatory agents and antiinfectives in combination 3 (1.8) 0
Corticosteroids and antiinfectives in combination 3 (1.8) 0
Surgical aids 3 (1.8) 0
Viscoelastic substances 3 (1.8) 0
Yet I expected:
Ophthalmologicals xx (xx.x) xx (xx.x)
......
......
Antiinflammatory agents and antiinfectives in combination 3 (1.8) 0
Corticosteroids and antiinfectives in combination 3 (1.8) 0
Local anesthetics 3 (1.8) 3 (2.2)
Local anesthetics 3 (1.8) 3 (2.2)
Surgical aids 3 (1.8) 0
Viscoelastic substances 3 (1.8) 0 |
There is a more complex way in {rtables}, but your fastest option is to reorder the factor levels alphabetically. See how I changed the order of STRATA1 in your example here:
library(rtables)
library(dplyr)
raw_lyt <- basic_table() %>%
split_cols_by("ARM") %>%
split_rows_by("SEX") %>%
summarize_row_groups() %>%
split_rows_by("RACE") %>%
summarize_row_groups() %>%
split_rows_by("STRATA1") %>%
summarize_row_groups() %>%
analyze("AGE")
DM_black <- DM[DM$RACE == "BLACK OR AFRICAN AMERICAN", ]
DM <- rbind(DM, DM_black, DM_black, DM_black, DM_black)
DM <- DM[1:100, ]
DM$STRATA1 <- factor(DM$STRATA1, levels = c("C", "B", "A"))
raw_tbl <- build_table(raw_lyt, DM)
coltrimmed <- raw_tbl[, col_counts(raw_tbl) > 0]
pruned <- prune_table(coltrimmed)
pruned
sort_at_path(pruned, path = c("SEX", "*", "RACE", "*", "STRATA1"), cont_n_onecol(3))
|
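Why reordering the factor levels helps can be illustrated with a small base R sketch. It assumes that sort_at_path() orders the scores the way base R's order() does, so that tied scores stay in their incoming order, which for a split is the factor-level order. The counts are the TRT A values quoted later in this thread; the names `trt_a` and `sorted_names` are illustrative only.

```r
# Counts in the sorting column, listed in alphabetical (factor-level) order.
trt_a <- c(
  "Direct factor xa inhibitors" = 3,
  "Direct thrombin inhibitors" = 0,
  "Enzymes" = 3,
  "Heparin group" = 3,
  "Platelet aggregation inhibitors excl. heparin" = 137,
  "Vitamin k antagonists" = 5
)

# order() leaves unresolved ties in their original order, so the three rows
# with a count of 3 remain alphabetical after the decreasing sort.
sorted_names <- names(trt_a)[order(trt_a, decreasing = TRUE)]
print(sorted_names)
```

If the incoming order of the tied rows were different (for example, reversed factor levels), the tied rows would come out in that order instead.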
So, what I'm doing is a typical ADCM table summarizing patient counts split by ATC2/ATC3/ATC4:
rtables::split_rows_by("ATC2", section_div = " ") %>%
rtables::summarize_row_groups() %>%
rtables::split_rows_by("ATC3") %>%
rtables::summarize_row_groups() %>%
rtables::split_rows_by("ATC4") %>%
rtables::summarize_row_groups()
I got a result like:
TRT A TRT B
(N=253) (N=252)
—————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————
Antithrombotic Agents 141 (55.6) 144 (57.0)
Antithrombotic agents 141 (55.6) 144 (57.0)
Platelet aggregation inhibitors excl. heparin 137 (54.0) 138 (54.6)
Vitamin k antagonists 5 (1.6) 1 (4.0)
Heparin group 3 (0.8) 9 (3.2)
Direct factor xa inhibitors 3 (0.8) 5 (1.6)
Enzymes 3 (0.8) 3 (0.8)
Direct thrombin inhibitors 0 4 (1.2)
I need everything sorted in decreasing order by TRT A only, first by ATC2, then ATC3, then ATC4. In the example above, when ATC4 rows share the same count, they are not sorted alphabetically; the same happens for ATC3 and ATC2. How can I address this? I tried converting ATC2, ATC3 and ATC4 to factors with all unique values as their levels, but it did not work. Am I going in the right direction? |
Specifying the order of the levels should be sufficient, as shown by @Melkiades, but we can't see the data, the full layout, or the sorting call, so it's very difficult to know what is going on in your case. |
Thank you @gmbecker, I put together some code to explain my example:
library(formatters)
library(rtables)
library(dplyr)
raw_lyt <- basic_table() %>%
split_cols_by("ARM") %>%
split_rows_by("STRATA1", split_fun = drop_split_levels) %>%
summarize_row_groups() %>%
split_rows_by("CMCAT", split_fun = drop_split_levels) %>%
summarize_row_groups() %>%
split_rows_by("CMDECOD", split_fun = drop_split_levels) %>%
summarize_row_groups()
raw_tbl <- build_table(raw_lyt, ex_adcm)
raw_tbl
Then you should get:
> raw_tbl
A: Drug X B: Placebo C: Combination
——————————————————————————————————————————————————————————————
A 179 (29.4%) 193 (31.0%) 218 (31.0%)
medcl A 59 (9.7%) 65 (10.5%) 87 (12.4%)
medname A_1/3 26 (4.3%) 19 (3.1%) 25 (3.6%)
medname A_2/3 14 (2.3%) 19 (3.1%) 31 (4.4%)
medname A_3/3 19 (3.1%) 27 (4.3%) 31 (4.4%)
medcl B 82 (13.5%) 82 (13.2%) 87 (12.4%)
medname B_1/4 16 (2.6%) 21 (3.4%) 24 (3.4%)
medname B_2/4 26 (4.3%) 26 (4.2%) 22 (3.1%)
medname B_3/4 20 (3.3%) 19 (3.1%) 17 (2.4%)
medname B_4/4 20 (3.3%) 16 (2.6%) 24 (3.4%)
medcl C 38 (6.2%) 46 (7.4%) 44 (6.3%)
medname C_1/2 17 (2.8%) 20 (3.2%) 22 (3.1%)
medname C_2/2 21 (3.4%) 26 (4.2%) 22 (3.1%)
B 207 (34.0%) 205 (33.0%) 249 (35.4%)
medcl A 74 (12.2%) 67 (10.8%) 76 (10.8%)
medname A_1/3 19 (3.1%) 27 (4.3%) 38 (5.4%)
medname A_2/3 37 (6.1%) 23 (3.7%) 21 (3.0%)
medname A_3/3 18 (3.0%) 17 (2.7%) 17 (2.4%)
medcl B 81 (13.3%) 99 (15.9%) 119 (16.9%)
medname B_1/4 25 (4.1%) 31 (5.0%) 36 (5.1%)
medname B_2/4 18 (3.0%) 29 (4.7%) 29 (4.1%)
medname B_3/4 18 (3.0%) 18 (2.9%) 22 (3.1%)
medname B_4/4 20 (3.3%) 21 (3.4%) 32 (4.6%)
medcl C 52 (8.5%) 39 (6.3%) 54 (7.7%)
medname C_1/2 26 (4.3%) 17 (2.7%) 26 (3.7%)
medname C_2/2 26 (4.3%) 22 (3.5%) 28 (4.0%)
C 223 (36.6%) 224 (36.0%) 236 (33.6%)
medcl A 72 (11.8%) 75 (12.1%) 79 (11.2%)
medname A_1/3 26 (4.3%) 24 (3.9%) 36 (5.1%)
medname A_2/3 25 (4.1%) 29 (4.7%) 27 (3.8%)
medname A_3/3 21 (3.4%) 22 (3.5%) 16 (2.3%)
medcl B 101 (16.6%) 94 (15.1%) 100 (14.2%)
medname B_1/4 34 (5.6%) 30 (4.8%) 23 (3.3%)
medname B_2/4 20 (3.3%) 18 (2.9%) 25 (3.6%)
medname B_3/4 21 (3.4%) 28 (4.5%) 29 (4.1%)
medname B_4/4 26 (4.3%) 18 (2.9%) 23 (3.3%)
medcl C 50 (8.2%) 55 (8.8%) 57 (8.1%)
medname C_1/2 28 (4.6%) 30 (4.8%) 27 (3.8%)
medname C_2/2 22 (3.6%) 25 (4.0%) 30 (4.3%)
My need is to sort STRATA1 in decreasing order of counts, then alphabetically, using only arm "A: Drug X", and then hierarchically CMCAT and CMDECOD in the same way. The tricky part is here:
A: Drug X B: Placebo C: Combination
——————————————————————————————————————————————————————————————
A 179 (29.4%) 193 (31.0%) 218 (31.0%)
......
medcl B 82 (13.5%) 82 (13.2%) 87 (12.4%)
......
medname B_3/4 20 (3.3%) 19 (3.1%) 17 (2.4%)
medname B_4/4 20 (3.3%) 16 (2.6%) 24 (3.4%)
Here the total across all column splits is larger for "medname B_4/4" than for "medname B_3/4", but since both have the same count in arm "A: Drug X", "medname B_3/4" should be sorted before "medname B_4/4". I hope I have explained the issue clearly now. Thanks in advance. |
appears to do what you want it to do, IIUC:
This is controlled by the factor level order as @Melkiades pointed to above, which we can see by reversing their order and then doing identical sorting:
which gives us
|
Thank you @gmbecker and @Melkiades. I found that my issue was due to factor levels: after converting every split row variable to a factor with alphabetically ordered levels, sort_at_path no longer appears to take other columns into account when sorting. Many thanks to you both! |
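The resolution described above could be sketched as follows. This is an illustration under stated assumptions, not the maintainers' code, which is not shown in this thread: each split variable is converted to a factor with alphabetical levels before the table is built, and the table is then sorted by the first column at each level in turn. The names `adcm`, `split_vars`, `tbl` and `sorted` are hypothetical; `ex_adcm` is the example dataset used earlier in the thread.

```r
library(rtables) # attaches formatters, which provides ex_adcm

adcm <- ex_adcm
split_vars <- c("STRATA1", "CMCAT", "CMDECOD")

# Give every split variable alphabetically ordered factor levels, so that
# rows with tied scores fall back to alphabetical order.
for (v in split_vars) {
  vals <- as.character(adcm[[v]])
  adcm[[v]] <- factor(vals, levels = sort(unique(vals)))
}

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("STRATA1", split_fun = drop_split_levels) %>%
  summarize_row_groups() %>%
  split_rows_by("CMCAT", split_fun = drop_split_levels) %>%
  summarize_row_groups() %>%
  split_rows_by("CMDECOD", split_fun = drop_split_levels) %>%
  summarize_row_groups()

tbl <- build_table(lyt, adcm)

# One sort_at_path() call per level, each scored on column 1 ("A: Drug X").
sorted <- tbl %>%
  sort_at_path(path = c("STRATA1"), scorefun = cont_n_onecol(1)) %>%
  sort_at_path(path = c("STRATA1", "*", "CMCAT"), scorefun = cont_n_onecol(1)) %>%
  sort_at_path(
    path = c("STRATA1", "*", "CMCAT", "*", "CMDECOD"),
    scorefun = cont_n_onecol(1)
  )

sorted
```

Sorting only reorders subtrees, so `sorted` should contain the same rows as `tbl`, with each level ordered by its count in the first column.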
Referring to Pruning and Sorting Tables, I happen to have the same need to use:
sort_at_path(pruned, path = c("RACE", "*", "STRATA1"), cont_n_onecol(5))
to generate a table similar to:
What if there are 100 participants who are BLACK OR AFRICAN AMERICAN and female in ARM C: Combination?
How can I sort this table by RACE first using column C: Combination/F, so that BLACK OR AFRICAN AMERICAN moves before ASIAN?
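Following the earlier advice in this thread that each level needs its own sort, one possible sketch is to add a second sort_at_path() call at the RACE level. The layout below is an assumption modelled on the call quoted above (columns split by ARM and then SEX, so that column 5 is C: Combination / F once empty columns are trimmed); the name `sorted` is illustrative, and the resulting order depends on the data.

```r
library(rtables)

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_cols_by("SEX") %>%
  split_rows_by("RACE") %>%
  summarize_row_groups() %>%
  split_rows_by("STRATA1") %>%
  summarize_row_groups() %>%
  analyze("AGE")

raw_tbl <- build_table(lyt, DM)
coltrimmed <- raw_tbl[, col_counts(raw_tbl) > 0]
pruned <- prune_table(coltrimmed)

# Sort STRATA1 within each RACE, then sort the RACE groups themselves,
# both scored on column 5 (assumed to be C: Combination / F).
sorted <- pruned %>%
  sort_at_path(path = c("RACE", "*", "STRATA1"), scorefun = cont_n_onecol(5)) %>%
  sort_at_path(path = c("RACE"), scorefun = cont_n_onecol(5))

sorted
```

With this second call, whichever RACE group has the largest count in column 5 should come first, so BLACK OR AFRICAN AMERICAN would move ahead of ASIAN only if its count in that column is higher.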