-
Hi~ The first question is about missing data (dropped trials) and Cronbach's alpha. As you mentioned, it may not be appropriate to apply Cronbach's alpha to trial responses when the scoring algorithm drops trials, for instance those with RTs < 200 ms. Do you have any suggestions for calculating Cronbach's alpha in this case? I noticed that SPSS deletes items with missing data entirely. The second question is about calculating the correlations after Monte Carlo splitting. I did it in three steps. First, I used Monte Carlo splitting to resample the trials of each condition with replacement via the by_split function. The third question is about the correlation methods. In your scripts (VPT - Difference of Means), you listed Spearman-Brown adjusted Pearson correlations (…
-
Hey @ShawGoGo,

Thanks for your questions and nice to see you're enjoying the splithalfr! Below I (hopefully accurately) summarize each question and try to give a useful response.

(Q1) How do I calculate Cronbach's alpha if I drop trials?
I could imagine a couple of approaches, using different definitions of alpha. For instance, coefficient alpha can be expressed via inter-item correlations, or (I suspect) as a SEM model. Calculate the parameters of the model with a method that handles missing data, and you've got a nice case for the equivalence of your coefficient. Via splithalfr, I approached alpha using the Flanagan-Rulon coefficient; see this simulation. You could extend this method by dropping trials before calculating the mean; a sketch follows below this reply.

(Q2) How do I calculate reliability when splitting Monte Carlo?
The split seems OK. However, when you split Monte Carlo there is no need to apply a Spearman-Brown adjustment to the correlations, like you need to do for first-second, odd-even, or permutated splits. That's because Monte Carlo splitting already produces two parts that are each just as long as the task you're splitting.

(Q3) Which coefficient is the best option?
I haven't examined this in depth (the whole splitting thing turned out to be a project in itself), but I can speculate a bit. If you randomize and repeat trials, the three CTT coefficients (Spearman-Brown, Flanagan-Rulon, Angoff-Feldt) tend to have the same values. I think that's because the three coefficients use models with increasing numbers of trial-level parameters, but a random sequence of trials does not offer any information for estimating those parameters. See Warrens (2015). I know of a single paper that sorted trials of a cognitive task in such a way that it made sense to fit a model with trial-level parameters (Green et al., 2016). ICC is different; there are over six versions of it. Those ICC ideas about consistency and agreement have, as far as I know, not been applied to split-half methods, but it could be really interesting :)

Best, Thomas
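To make the Q1 suggestion concrete, here is a minimal sketch (hypothetical data; the 200 ms cutoff, the trial counts, and the number of replications are arbitrary choices). It drops fast trials inside the scoring function, applies permutated splitting via by_split, and computes a Flanagan-Rulon coefficient per replication, written out here as 4 * cov(X1, X2) / var(X1 + X2):

library(splithalfr)

# Hypothetical example data: 50 participants, 20 trials each
set.seed(42)
example_data = data.frame(
  participant_id = rep(1 : 50, each = 20),
  rt = rnorm(50 * 20, mean = 600, sd = 150)
)

# Scoring function that drops trials with RTs < 200 ms before averaging
score_with_drops = function(ds) {
  ds = ds[ds$rt >= 200, ]
  return (mean(ds$rt))
}

# Permutated splits: two non-overlapping halves, drawn without replacement
split_scores = by_split(
  data = example_data,
  participants = example_data$participant_id,
  fn_score = score_with_drops,
  replications = 100,
  method = "random",
  replace = FALSE,
  split_p = 0.5,
  ncores = 1
)

# Flanagan-Rulon coefficient, written out explicitly
flanagan_rulon_coef = function(score_1, score_2) {
  4 * cov(score_1, score_2) / var(score_1 + score_2)
}

# One coefficient per replication; average for a point estimate
mean(split_coefs(split_scores, flanagan_rulon_coef))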
-
Yes, that is what I did for the data set. Thank you very much for the example. Cheers!
xz
On Tue, Aug 10, 2021 at 9:17 PM tpronk wrote:
Two-part coefficients, like Spearman-Brown, Flanagan-Rulon, Angoff-Feldt, and ICC, are calculated on aggregated scores, not on individual items/trials. So it's handiest to have a dataset with 50 rows, and columns for participant, replication, and scores for each participant on each of the two parts.
Here is an example
library(splithalfr)
# Example data
example_data = data.frame(
participant_id = rep(1 : 50, each = 20),
trial_id = rep(1 : 20, 50),
rt = rnorm(50 * 20)
)
# Example scoring function; receives (split) data from one participant and
# returns a score; for example, the mean.
# When we are splitting, the function is called twice; once per split part
example_score = function(ds) {
return (mean(ds$rt))
}
# One Monte Carlo replication
split_scores = by_split(
data = example_data,
participants = example_data$participant_id,
fn_score = example_score,
replications = 1,
method = "random",
replace = TRUE,
split_p = 1,
ncores = 1
)
# split_scores has 50 rows of data; one row per participant and replication
# split_scores has columns score_1 and score_2; these are the scores returned
# by example_score for each of the two parts
# Now we can calculate a coefficient for one replication. Since we're splitting
# Monte Carlo, each part is just as long as the original dataset, so no
# Spearman-Brown adjustment needed
cor(split_scores$score_1, split_scores$score_2)
# Same result as above, but if we had multiple replications, split_coefs
# would return a vector of correlations; one per replication
split_coefs(split_scores, cor)
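As a minimal sketch of that multi-replication case (the 1000 replications are an arbitrary choice; data and scoring function as above):

split_scores = by_split(
  data = example_data,
  participants = example_data$participant_id,
  fn_score = example_score,
  replications = 1000,
  method = "random",
  replace = TRUE,
  split_p = 1,
  ncores = 1
)
# split_coefs now returns a vector of 1000 correlations; one per replication
replication_cors = split_coefs(split_scores, cor)
# Summarize, for instance, via the mean
mean(replication_cors)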
-
@ShawGoGo, a question (and concern) about Monte Carlo splitting came up in this thread. Perhaps it is of interest to you?