-
-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds data argument to effectsize.htest/cohens_d #522
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #522 +/- ##
==========================================
+ Coverage 90.71% 90.75% +0.04%
==========================================
Files 57 56 -1
Lines 3563 3440 -123
==========================================
- Hits 3232 3122 -110
+ Misses 331 318 -13 ☔ View full report in Codecov by Sentry. |
I haven't given a close look yet, but this seems like it does not cover the full scope of possible See @strengejacke code here: https://github.com/easystats/insight/blob/main/R/utils_get_data.R |
Ok. (When you have time) would it be possible to give me an example of what you mean by "htest data.name patterns"? Do you mean like to support the different htest objects (it wasn't clear to me from the link you provided)? I just added support for cohen's d and t-test for now, but am happy to add support for more if useful. |
|
devtools::load_all("D:/github/forks/effectsize")
#> ℹ Loading effectsize
# t.test
x <- t.test(mpg ~ vs, data = mtcars)
effectsize(x)
#> Warning: Unable to retrieve data from htest object. Returning an approximate
#> effect size using t_to_d().
#> d | 95% CI
#> ----------------------
#> -1.96 | [-2.94, -0.95]
effectsize(x, data = mtcars)
#> Cohen's d | 95% CI
#> --------------------------
#> -1.70 | [-2.55, -0.82]
#>
#> - Estimated using un-pooled SD.
# cor.test
# no need to specify the data argument
x <- cor.test(~ qsec + drat, data = mtcars)
effectsize(x)
#> Warning: This 'htest' method is not (yet?) supported.
#> Returning 'parameters::model_parameters(model)'.
#> Pearson's product-moment correlation
#>
#> Parameter1 | Parameter2 | r | 95% CI | t(30) | p
#> --------------------------------------------------------------
#> qsec | drat | 0.09 | [-0.27, 0.43] | 0.50 | 0.620
#>
#> Alternative hypothesis: true correlation is not equal to 0
# wilcox.test
x <- wilcox.test(mpg ~ vs, data = mtcars)
#> Warning in wilcox.test.default(x = DATA[[1L]], y = DATA[[2L]], ...): cannot
#> compute exact p-value with ties
effectsize(x)
#> Error: Unable to retrieve data from htest object.
#> Try using 'rb()' directly.
effectsize(x, data = mtcars)
#> Warning: 'y' is numeric but has only 2 unique values.
#> If this is a grouping variable, convert it to a factor.
#> r (rank biserial) | 95% CI
#> --------------------------------
#> 1.00 | [1.00, 1.00]
# chisq.test
# No data argument, so it is not possible to use the formula interface.
# fisher.test
# No data argument, so it is not possible to use the formula interface.
# chisq.test_gof
# ???
# mcnemar.test
# No data argument, so it is not possible to use the formula interface.
# friedman.test
wb <- aggregate(warpbreaks$breaks, by = list(
w = warpbreaks$wool, t = warpbreaks$tension), FUN = mean)
wb
#> w t x
#> 1 A L 44.55556
#> 2 B L 28.22222
#> 3 A M 24.00000
#> 4 B M 28.77778
#> 5 A H 24.55556
#> 6 B H 18.77778
x <- friedman.test(x ~ w | t, data = wb)
effectsize(x)
#> Error: Unable to retrieve data from htest object.
#> Try using 'kendalls_w()' directly.
effectsize(x, data = wb)
#> Kendall's W | 95% CI
#> --------------------------
#> 0.11 | [0.11, 1.00]
#>
#> - One-sided CIs: upper bound fixed at [1.00].
# chisq.test_gof
# no need to specify the data argument
# oneway.test
# no need to specify the data argument
x <- oneway.test(extra ~ group, data = sleep, var.equal = TRUE)
effectsize(x)
#> Eta2 | 95% CI
#> -------------------
#> 0.16 | [0.00, 1.00]
#>
#> - One-sided CIs: upper bound fixed at [1.00].
# kruskal.test
# Cannot get it to work
airquality2 <- airquality
airquality2$Month <- as.factor(airquality2$Month)
x <- kruskal.test(Ozone ~ Month, data = airquality2)
effectsize(x)
#> Warning in grepl("Kruskal-Wallis", x$method, fixed = TRUE) &&
#> grepl("^list\\(", : 'length(x) = 2 > 1' in coercion to 'logical(1)'
#> Error: Unable to retrieve data from htest object.
#> Try using 'rank_epsilon_squared()' directly.
effectsize(x, data = airquality2)
#> Warning in grepl("Kruskal-Wallis", x$method, fixed = TRUE) &&
#> grepl("^list\\(", : 'length(x) = 2 > 1' in coercion to 'logical(1)'
#> Warning: Missing values detected. NAs dropped.
#> Epsilon2 (rank) | 95% CI
#> ------------------------------
#> 0.25 | [0.15, 1.00]
#>
#> - One-sided CIs: upper bound fixed at [1.00]. Created on 2022-12-16 with reprex v2.0.2 |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@rempsyc chisq.test doesn't need any data - it is the only htest that actually contains the data 🥳 To merge this, we would need a lot of unit tests that show that this feature can be used as intended.
library(effectsize)
library(testthat)
tt1 <- t.test(mpg ~ I(am + cyl == 4), data = mtcars)
dd1 <- cohens_d(mpg ~ I(am + cyl == 4), data = mtcars, pooled_sd = FALSE)
dat <- mtcars
tt2 <- t.test(dat$mpg[dat$am==1], dat$mpg[dat$am==0])
dd2 <- cohens_d(dat$mpg[dat$am==1], dat$mpg[dat$am==0], pooled_sd = FALSE)
rm("dat")
col.y <- "mpg"
tt3 <- t.test(mtcars[[col.y]])
dd3 <- cohens_d(mtcars[[col.y]])
rm("col.y")
expect_equal(effectsize(tt1, data = mtcars)[[1]], dd1[[1]])
#> Error in `[.data.frame`(dots$data, , vars) : undefined columns selected
expect_equal(effectsize(tt2, data = mtcars)[[1]], dd2[[1]])
#> Error in `[.data.frame`(dots$data, , vars) : undefined columns selected
expect_equal(effectsize(tt3, data = mtcars)[[1]], dd3[[1]])
#> Error in `[.data.frame`(dots$data, , vars) : undefined columns selected
|
Thanks, that's very useful and specific! I'll look into implementing these suggestions :) |
Doesn’t work well with advanced formula interfaces for now but I return Also, to fix the lints, I was going to change all the |
Are you sure? Below I reproduce your example. Do you think it needs something more? library(effectsize)
some_data <- mtcars
some_data$mpg[1] <- NA
args <- alist(
mpg ~ am,
data = some_data,
alternative = "less",
mu = 1,
pooled_sd = TRUE,
var.equal = TRUE,
subset = cyl == 4,
na.action = na.omit
)
tt <- do.call(args, what = t.test)
d1 <- do.call(args, what = cohens_d)
d2 <- do.call(c(alist(tt), args[-1]), what = effectsize)
all.equal(d1$Cohens_d, d2$Cohens_d)
#> [1] TRUE Created on 2023-08-19 with reprex v2.0.2
I'm happy to start on that first, with your approval. Worst case scenario it's not wasted even if we don't end up merging this PR.
True, but in my experience these arguments are not really used, but even so, we support them if the user really wants them, so I think that's ok.
I think so, with the above example and all new tests I included, I think it adds value for the user. But maybe there is more that I am missing and hopefully you can point it out to me :P |
Without looking at the code, but looking at the tests, this looks pretty good. |
Mattan, there are still some (new) lints, but I can address them with your approval (since it involves changing a lot of variable names that collide with base R functions) |
I still just don't like this as a solution - it requires the user to do so much work that they might as well just re-run the code to not use a formula (all of these tests run in a fraction of a second), no? Does this fix anything upstream (e.g. in report)? If you really think this is advantageous, go a head and make the fixes and merge (: |
Yes, I can understand that, but the way I see it, it adds functionality and does not remove functionality, and yes, as well, it does fix some issues upstream with report :) So will go ahead with the fixes and merge, thanks! |
Failing tests are unrelated to this PR |
I think I need the approval of a reviewer to merge this, if I want to avoid bypassing the branch protections |
Fixes easystats/report#245
Created on 2022-10-16 with reprex v2.0.2