-
Notifications
You must be signed in to change notification settings - Fork 1
/
ch_cmh.Rmd
238 lines (165 loc) · 10.8 KB
/
ch_cmh.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
# Cochran Mantel Haenszel Test
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, cache = TRUE)
library(dplyr)
```
The CMH procedure tests for conditional independence in partial contingency tables for a 2 x 2 x K design. However, it can be generalized to tables of X x Y x K dimensions.
```{r, echo = FALSE, fig.align='center'}
knitr::include_graphics('images/cmh/img.png')
```
## Naming Convention
For the remainder of this document, we adopt the following naming convention when referring to variables of a contingency table:
- X = exposure
- Y = response
- K = control
## Scale
The `scale` of the exposure (X) and response (Y) variables dictate which test statistic is computed for the contingency table. Each test statistic is evaluated on different degrees of freedom (df):
- `General association` statistic (X and Y both nominal) results in `(X-1) * (Y-1) dfs`
- `Row mean` scores statistic (X is nominal and Y is ordinal) results in `X-1 dfs`
- `Nonzero correlation` statistic (X and Y both ordinal) results in `1 df`
## Testing Strategy
### Data
To begin investigating the differences in the SAS and R implementations of the CMH test, we decided to use the CDISC Pilot data set, which is publicly available on the PHUSE Test Data Factory repository. We applied very basic filtering conditions upfront (see below) and this data set served as the basis of the examples to follow.
```{r, eval = FALSE}
# Download CDISC Pilot Dataset
# Perform filtering
library(haven)
read_xpt("https://github.com/phuse-org/TestDataFactory/raw/main/Updated/TDF_ADaM/adcibc.xpt") %>%
filter(EFFFL == 'Y' & ITTFL == 'Y', AVISITN == 8 & ANL01FL=='Y')
```
### Schemes
In order to follow a systematic approach to testing, and to cover variations in the CMH test, we considered the traditional 2 x 2 x K design as well as scenarios where the generalized CMH test is employed (e.g. 5 x 3 x 3).
We present 5 archetype test scenarios that illustrate diverging results, possibly related to sparse data and possibly considered edge cases.
| Number | Schema | Variables | Relevant Test | Description |
|---|---|---|---|---|
| 1 | 2x2x2 | X = TRTP, Y = SEX, K = AGEGR1 | General Association | TRTP and AGEGR1 were limited to two categories, overall the the groups were rather balanced |
| 3| 2x2x3 | X = TRTP, Y = SEX, K = RACE | General Association | Gives back NaN in R because RACE is very imbalanced|
| 6| 2x5x2 | X = TRTP, Y = AVAL, K = SEX | Row Means | Compare Row Means results for R and SAS because Y is ordinal|
| 9| 3x5x17 |X = TRTP, Y = AVAL, K = SITEID | Row Means | SITEID has many strata and provokes sparse groups, AVAL is ordinal, therefore row means statistic applies here, R threw an error |
| 10 | 5x3x3 | X = AVAL, Y = AGEGR1, K = TRTP | Correlation | X and Y are ordinal variables and therefore the correlation statistics has to be taken here |
## Implementation
### CMH in SAS
The cmh test is calculated in SAS using the PROC FREQ procedure. By default, it outputs the chi square statistic, degrees of freedom and p-value for each of the three alternative hypothesis: `general association`, `row means differ`, and `nonzero correlation`. It is up to the statistical analyst or statistician to know which result is appropriate for their analysis.
When the design of the contingency table is 2 x 2 x K (i.e, X == 2 levels, Y == 2 levels, K >= 2 levels), the Mantel-Haenszel Common Odds Ratio (odds ratio estimate, 95% CI, P-value) and the Breslow-Day Test for Homogeneity of the Odds Ratios (chi-square statistic, degrees of freedom, P-value) are also output.
Below is the syntax to conduct a CMH analysis in SAS:
```{r, eval = FALSE}
Proc freq data = filtered_data;
tables K * X * Y / cmh;
* the order of K, X, and Y appearing on the line is important!;
run;
```
### In R
We did not find any R package that delivers all the same measures as SAS at once. Therefore, we tried out multiple packages:
```{r, echo = FALSE, warning=FALSE, message=FALSE}
library(magrittr)
library(reactable)
# table approach inspired by Sharla - https://sharla.party/post/comparing-two-dfs/
yes <- "\U2705"
no <- "\U274C"
tibble::tribble(
~Package, ~`General Association`, ~`Row Means Differ`, ~`Nonzero Correlation`, ~`M-H Odds Ratio`, ~`Homogeneity Test`, ~Note,
"stats::mantelhaen.test()", yes, no, no, yes, no, "Works well for 2x2xK",
"vcdExtra::CMHtest()", yes,yes,yes,no,no,"Problems with sparsity, potential bug",
"epiDisplay::mhor()", no,no,no,yes,yes,"OR are limited to 2x2xK design") %>%
reactable::reactable(fullWidth = TRUE,
defaultColDef = colDef(html = TRUE, align = "center"),
columns = list(Note = colDef(width = 120),
Package = colDef(width = 170))
)
```
#### mantelhaen.test()
This is included in a base installation of R, as part of the stats package. Requires inputting data as a *table* or as *vectors*.
```{r, eval = F}
mantelhaen.test(x = data$x, y = data$y, z = data$k)
```
#### CMHtest()
The vcdExtra package provides results for the generalized CMH test, for each of the three model it outputs the Chi-square value and the respective p-values. Flexible data input methods available: *table* or *formula* (aggregated level data in a data frame).
```{r, eval = FALSE}
library(vcdExtra)
CMHtest(Freq ~ X + Y | K , data=data, overall=TRUE)
```
##### Forked Version - Solution for sparse data
To tackle the [issue with sparse data](https://github.com/friendly/vcdExtra/issues/3) it is recommended that a use of `solve()` is replaced with `MASS::ginv`. This was implemented in the forked version of vcdExtra which can be installed from here:
```{r, eval = FALSE}
devtools::install_github("mstackhouse/vcdExtra")
```
However, also the forked version for the vcdExtra package works only until a certain level of sparsity. In case of our data, it still works if the data are stratified by the pooled Site ID (SITEGR1 - 11 unique values) whereas using the unpooled Site ID (SITEID - 17 unique values) also throws an error.
##### Inconsistent Results
Exploring the vcdExtra package in detail we realized that there seems to be a bug in the implementation which we would like to highlight here.
Using an independent data example (3x2x2), we first ran the CMH test through SAS and obtained the following:
```{r}
# Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
# Statistic Alternative Hypothesis DF Value Prob
# ---------------------------------------------------------------
# 1 Nonzero Correlation 1 6.4586 0.0110
# 2 Row Mean Scores Differ 2 26.0278 <.0001
# 3 General Association 2 26.0278 <.0001
```
```{r, echo = FALSE, warning = FALSE, message = FALSE}
test_eg <- data.frame(x = c('low','low','low','low','med','med','med','med', 'high','high','high','high'),
k = c('yes','yes','no','no', 'yes','yes','no','no','yes','yes','no','no'),
y = c('yes','no', 'yes','no','yes','no','yes','no','yes','no','yes','no'),
Freq = c(11,43,42,169,14,104,20,132,8,196,2,59))
library(vcdExtra)
```
When we specify the analysis in R this way, we get the following results:
```{r}
CMHtest(Freq~x+y|k, data=test_eg, overall=TRUE, details=TRUE)$ALL$table
```
By default, all 3 tests (and a 4th) are output. The results seem to match.
When we specify the analysis using an explicit *type* argument, we get the following results:
```{r}
CMHtest(Freq~x+y|k, data=test_eg, overall=TRUE, details=TRUE, type = "ALL")$ALL$table
```
While we've essentially asked for everything (i.e. the default), albeit explicitly, the results do not match. The degrees of freedom seem to be mismatched with results in wrong p-values (if we assume SAS is the gold standard)
Similarly, if we specify we would *only* like certain tests returned, the results seem dependent on the order of specification:
```{r}
# Order: cor, general, rmeans
CMHtest(Freq~x+y|k, data=test_eg, overall=TRUE, details=TRUE, types=c("cor","general","rmeans"))$ALL$table
```
```{r}
# Order: rmeans, general, cor
CMHtest(Freq~x+y|k, data=test_eg, overall=TRUE, details=TRUE, types=c("rmeans","general","cor"))$ALL$table
```
Impact: In the event of a 2x2xK design, all 3 test statistics, degrees of freedom and p-values are the same. Therefore, specifying a 2x2xK analysis any of the ways above results in correct "answer", though they are technically mixed up.
This is why we chose as a 3x2x2, so we can clearly *where* things are mixed up.
We are planning to submit an issue to the vcdExtra github for the author's review.
#### Epi Display package
To get the M-H common odds ratio and the homogeneity test, the epiDisplay package can be used.
```{r, eval = FALSE}
library(epiDisplay)
mhor(x,y,k, graph = FALSE)
```
## Results
Here the results can be seen:
### CMH Statistics
```{r, echo = F, warning = FALSE}
readRDS('data/cmh/table1b.rds')
```
As it can be seen, there are two schemata where R does not provide any results:
```{r, echo = F}
readRDS('data/cmh/table1a.rds')
```
**Reason for NaN in schema 3**: Stratum k = AMERICAN INDIAN OR ALASKA NATIVE can not be compared because there are only values for one treatment and one gender.
**Reason for Error 4:**
For large sparse table (many strata) CMHTest will occasionally throw an error in solve.default(AVA) because of singularity
### Odds Ratio
Only applicable to 2x2xK designs.
```{r, echo = F}
readRDS('data/cmh/table2.rds')
```
### Homogenity Test
Only applicable to 2x2xK designs.
```{r, echo = F}
readRDS('data/cmh/table3.rds')
```
## Summary and Recommendation
Having explored the available R packages to calculate the CMH statistics, using R can only be recommended if the analysis design is equivalent to 2 x 2 x K. Then, the base mantelhaen.test() function as well as the vcdExtra package show reliable results which are equal to the output of the SAS function. The same is true for the common odds ratio even though there is a marked difference in decimals.
For the generalized version of the cmh test no R package can be recommended so far. SAS and R outputs differ substantially (possibly due to the underlying subroutines or functions) and the vcdExtra package seems to deliver inconsistent results outside the 2 x 2 x K design.
## References
- [Accessible Summary](https://online.stat.psu.edu/stat504/lesson/5/5.4/5.4.5)
- [An Introduction to Categorical Data Analysis 2nd Edition (Agresti)](http://users.stat.ufl.edu/~aa/)
- [SAS Documentation (Specification)](https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/statug/statug_freq_examples07.htm)
- [SAS Documentation (Theoretical Basis + Formulas)](https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_freq_details92.htm)
- [Original Paper 1](https://doi.org/10.2307%2F3001616)
- [Original Paper 2](https://doi.org/10.1093/jnci/22.4.719)