dat <- read.csv("misc/LecXX-Datasets/KeysarEtAl2000.csv")
saveRDS(dat, "kbbb.rds")
- Random intercept is needed whenever there are multiple observations per unit
- Any within-unit factor gets a random slope, unless there is only one observation per level per unit
- Between-unit factors do not get a random slope
- For each interaction, include a slope for the highest order combination of within-subject factors subsumed by the interaction
- For time-series data, include random slopes for time predictors if you have more than one time series per unit
Keysar, B., Barr, D. J., Balin, J. A., & Brauner, J. S. (2000). Taking perspective in conversation: The role of mutual knowledge in comprehension. Psychological Science, 11, 32–38.
- When interpreting expressions e.g. the small candle, do listeners experience interference?
- 20 subjects, 12 items for each subject (N=240)
- one factor: condition (competitor vs. noncompetitor)
- data recorded using a 60 Hz eyetracker
- DV: latency to fixate the target object, measured from onset of the critical word
Field | Description |
---|---|
SubjID | Subject identifier (N=20) |
cond | Condition (E=competitor, C=noncompetitor) |
crit | Moment of onset of critical word (frames) |
targfix | Moment the hand touched the target (frames) |
object | Name of the experimental item |
- load the data from
KeysarEtAl2000.rds
into dataframedat
- calculate
tfix
, the latency for touching the target in milliseconds, store this in the dataframedat
- make histogram of
tfix
- create “truncated” versions of
tfix
,tfixTrunc
, truncating values higher than the 97.5th percentile - calculate means in each condition
- Now do the analysis using model comparison in a linear mixed
effects model, with maximal random effects
- tip: use deviation coding for condition
- Derive \(p\)-values using:
- Wald \(z\) statistic (“t-as-z”)
- Likelihood ratio tests
- Would you say that subjects or items introduce more overall variation in grand mean target latencies?
- Do subjects or items vary more in terms of the effect of condition (competitor)?
- Look at the BLUPS.
- Which items show the effect most strongly?
- Which subjects?
- Do all subjects show the effect?
- Do all items show the effect?
dat <- readRDS("kbbb.rds")
# calculate latencies
dat$tfix <- 1000*((dat$targfix - dat$crit) / 60)
hist(dat$tfix)
# truncate outliers at 97.5th percentile of distribution
cutoff.tfix <- quantile(dat$tfix, probs=.975, na.rm=TRUE)
dat$tfixTrunc <- ifelse(dat$tfix>cutoff.tfix, cutoff.tfix, dat$tfix)
hist(dat$tfixTrunc)
# aggregate
aggregate(tfixTrunc ~ cond, dat, mean)
# re-create data: t-test
dat.subj <- aggregate(tfixTrunc ~ SubjID + cond, dat, mean)
dat.subj <- dat.subj[order(dat.subj$SubjID, dat.subj$cond), ]
dat.t <- t.test(subset(dat.subj, cond=="C")$tfixTrunc,
subset(dat.subj, cond=="E")$tfixTrunc, paired=TRUE)
print(dat.t)
# linear mixed model
# create deviation-coded predictor
dat$D <- dat$cond == "E"
dat$C2 <- dat$D - mean(dat$D)
library("lme4")
mod1 <- lmer(tfixTrunc ~ C2 +
(1 + C2 | SubjID) +
(1 + C2 | object),
data=dat,
subset = complete.cases(dat),
REML=FALSE)
Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: tfixTrunc ~ C2 + (1 + C2 | SubjID) + (1 + C2 | object) Data: dat Subset: complete.cases(dat) AIC BIC logLik deviance df.resid 4421.9 4453.0 -2201.9 4403.9 226 Random effects: Groups Name Variance Std.Dev. Corr SubjID (Intercept) 417282 645.97 C2 758341 870.83 1.00 object (Intercept) 616982 785.48 C2 6765 82.25 1.00 Residual 7236631 2690.10 Number of obs: 235, groups: SubjID, 20; object, 12 Fixed effects: Estimate Std. Error t value (Intercept) 3306.4 321.1 10.296 C2 1439.6 402.2 3.579
paramest <- fixef(mod1)
stderrs <- sqrt(diag(vcov(mod1)))
tstats <- paramest / stderrs
pvals <- 2 * (1 - pnorm(abs(tstats)))
data.frame(b = paramest, se = stderrs, t = tstats, p = pvals)
mod2 <- update(mod1, . ~ . -C2)
anova(mod1, mod2)
chi2 <- deviance(mod2) - deviance(mod1)
pchi <- pchisq(chi2, 1, lower.tail = FALSE)
c(chisq = chi2, p = pchi)
blups <- ranef(mod1)
blups$SubjID$C2 + fixef(mod1)[2] # every subject shows effect
blups$object$C2 + fixef(mod1)[2] # every item shows effect
library("pbkrtest")
mod_kr <- KRmodcomp(mod1, mod2)
summary(mod_kr)
#+RESULTS[a21201dd2704288fcddacb9041b2d895bd722e63]:
F-test with Kenward-Roger approximation; computing time: 1.14 sec. large : tfixTrunc ~ C2 + (1 + C2 | SubjID) + (1 + C2 | object) small : tfixTrunc ~ (1 + C2 | SubjID) + (1 + C2 | object) stat ndf ddf F.scaling p.value Ftest 12.4550 1.0000 8.9771 1 0.006448 ** FtestU 12.4550 1.0000 8.9771 0.006448 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
mod_pb <- PBmodcomp(mod1, mod2)
summary(mod_pb)
#+RESULTS[3013bd824fc4a56fb33c20466b86fd7e9944571c]:
There were 50 or more warnings (use warnings() to see the first 50) Parametric bootstrap test; time: 156.28 sec; samples: 1000 extremes: 0; Requested samples: 1000 Used samples: 996 Extremes: 0 large : tfixTrunc ~ C2 + (1 + C2 | SubjID) + (1 + C2 | object) small : tfixTrunc ~ (1 + C2 | SubjID) + (1 + C2 | object) stat df ddf p.value PBtest 10.539 0.0010030 ** Gamma 10.539 0.0005716 *** Bartlett 11.161 1.000 0.0008356 *** F 10.539 1.000 -33.912 LRT 10.539 1.000 0.0011689 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Warning message: In pf(Fobs, df1 = ndf, df2 = ddf) : NaNs produced