
Commit

Merge pull request #17 from dalejbarr/master
fix broken links to shiny apps
dalejbarr authored Dec 9, 2024
2 parents 9d04a7e + 70cc282 commit 2ae1980
Showing 62 changed files with 1,183 additions and 2,458 deletions.
3 changes: 1 addition & 2 deletions 02-correlation-regression.Rmd
@@ -432,6 +432,5 @@ To close, here are a few implications from the relationship between correlation
## Exercises

```{r cov-app, echo=FALSE, out.width="530px"}
knitr::include_app("https://rstudio-connect.psy.gla.ac.uk/covariance/", height = "480px")
knitr::include_app("https://talklab.psy.gla.ac.uk/app/covariance-site/", height = "480px")
```

2 changes: 1 addition & 1 deletion 04-interactions.Rmd
@@ -555,7 +555,7 @@ Note that the $Y$ variables with the dots in the subscripts are means of $Y$, tak

![](images/04-interactions_factorial_app.png)

-[Launch this web application](https://rstudio-connect.psy.gla.ac.uk/factorial){target="_blank"} and experiment with factorial designs until you understand the key concepts of main effects and interactions in a factorial design.
+[Launch this web application](https://talklab.psy.gla.ac.uk/app/factorial-site/){target="_blank"} and experiment with factorial designs until you understand the key concepts of main effects and interactions in a factorial design.

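If it also helps to see main effects and an interaction computed by hand, here is a small sketch using made-up cell means for a hypothetical 2×2 design; none of these numbers come from the chapter.

``` r
## hypothetical cell means for a 2x2 design (factor A in rows, factor B in columns)
cell_means <- matrix(c(800, 750,
                       780, 670),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(A = c("A1", "A2"), B = c("B1", "B2")))

grand_mean <- mean(cell_means)                # overall mean
A_main <- rowMeans(cell_means) - grand_mean   # main effect of A (row means vs grand mean)
B_main <- colMeans(cell_means) - grand_mean   # main effect of B (column means vs grand mean)

## interaction: what remains after removing the grand mean and both main effects
AB_interaction <- cell_means - grand_mean -
  outer(A_main, rep(1, 2)) - outer(rep(1, 2), B_main)

AB_interaction  # non-zero entries signal an interaction
```
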
## Code your own categorical predictors in factorial designs

2 changes: 1 addition & 1 deletion 05-linear-mixed-effects-intro.Rmd
@@ -749,5 +749,5 @@ ggplot(sleep2, aes(x = days_deprived, y = Reaction)) +

## Multi-level app

-[Try out the multi-level web app](https://rstudio-connect.psy.gla.ac.uk/multilevel){target="_blank"} to sharpen your understanding of the three different approaches to multi-level modeling.
+[Try out the multi-level web app](https://talklab.psy.gla.ac.uk/app/multilevel-site/){target="_blank"} to sharpen your understanding of the three different approaches to multi-level modeling.

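If you would like the three approaches side by side in code, here is a compact sketch; it assumes the `sleep2` data frame from earlier in the chapter, with `Reaction`, `days_deprived`, and a `Subject` identifier, and is meant as an illustration rather than the chapter's own solution.

``` r
library("lme4")

## complete pooling: one intercept and slope for everyone, ignoring Subject
cp_mod <- lm(Reaction ~ days_deprived, data = sleep2)

## no pooling: a separate intercept and slope for each subject
np_mod <- lm(Reaction ~ days_deprived * Subject, data = sleep2)

## partial pooling: random intercepts and slopes by Subject
pp_mod <- lmer(Reaction ~ days_deprived + (days_deprived | Subject),
               data = sleep2)

summary(pp_mod)
```
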
2 changes: 1 addition & 1 deletion 06-linear-mixed-effects-one.Rmd
@@ -84,7 +84,7 @@ One of the main selling points of the general linear models / regression framewo

Let's consider a situation where you are testing the effect of alcohol consumption on simple reaction time (e.g., press a button as fast as you can after a light appears). To keep it simple, let's assume that you have collected data from 14 participants randomly assigned to perform a set of 10 simple RT trials after one of two interventions: drinking a pint of alcohol (treatment condition) or a placebo drink (placebo condition). You have 7 participants in each of the two groups. Note that you would need more than this for a real study.

-This [web app](https://rstudio-connect.psy.gla.ac.uk/icc){target="_blank"} presents simulated data from such a study. Subjects P01-P07 are from the placebo condition, while subjects T01-T07 are from the treatment condition. Please stop and have a look!
+This [web app](https://talklab.psy.gla.ac.uk/app/icc-site/){target="_blank"} presents simulated data from such a study. Subjects P01-P07 are from the placebo condition, while subjects T01-T07 are from the treatment condition. Please stop and have a look!

If we were going to run a t-test on these data, we would first need to calculate subject means, because otherwise the observations are not independent. You could do this as follows. (If you want to run the code below, you can download sample data from the web app above and save it as `independent_samples.csv`).

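The chunk with that code is collapsed in this diff, but one plausible way to do it is sketched below; the column names `subject_id`, `cond`, and `RT` are assumptions, so adjust them to match the file you actually download.

``` r
library("tidyverse")

dat <- read_csv("independent_samples.csv")

## one mean RT per subject, keeping the group label
## (column names are assumed -- check the downloaded file)
subj_means <- dat %>%
  group_by(subject_id, cond) %>%
  summarise(mean_RT = mean(RT), .groups = "drop")

## independent-samples t-test on the subject means
t.test(mean_RT ~ cond, data = subj_means)
```
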
2 changes: 1 addition & 1 deletion 07-crossed-random-factors.Rmd
@@ -231,7 +231,7 @@ For more technical details about convergence problems and what to do, see `?lme4

## Simulating data with crossed random factors

-For these exercises, we will generate simulated data corresponding to an experiment with a single, two-level factor (independent variable) that is within-subjects and between-items. Let's imagine that the experiment involves lexical decisions to a set of words (e.g., is "PINT" a word or nonword?), and the dependent variable is response time (in milliseconds), and the independent variable is word type (noun vs verb). We want to treat both subjects and words as random factors (so that we can generalize to the population of events where subjects encounter words). You can play around with the web app (or [click here to open it in a new window](https://rstudio-connect.psy.gla.ac.uk/crossed){target="_blank"}), which allows you to manipulate the data-generating parameters and see their effect on the data.
+For these exercises, we will generate simulated data corresponding to an experiment with a single, two-level factor (independent variable) that is within-subjects and between-items. Let's imagine that the experiment involves lexical decisions to a set of words (e.g., is "PINT" a word or nonword?), and the dependent variable is response time (in milliseconds), and the independent variable is word type (noun vs verb). We want to treat both subjects and words as random factors (so that we can generalize to the population of events where subjects encounter words). You can play around with the web app (or [click here to open it in a new window](https://talklab.psy.gla.ac.uk/app/crossed-site/){target="_blank"}), which allows you to manipulate the data-generating parameters and see their effect on the data.

By now, you should have all the pieces of the puzzle that you need to simulate data from a study with crossed random effects. @Debruine_Barr_2020 provides a more detailed, step-by-step walkthrough of the exercise below.

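As a rough preview of what such a simulation involves, here is a deliberately simplified sketch with random intercepts only and made-up parameter values; the exercise and @Debruine_Barr_2020 walk through the full version with random slopes and correlations.

``` r
library("tidyverse")

set.seed(1451)

n_subj  <- 20     # number of subjects (random factor)
n_items <- 30     # number of words (random factor), half nouns and half verbs

beta_0  <- 800    # grand mean RT in ms (made up)
beta_1  <- -50    # effect of word type (made up)
tau_0   <- 100    # SD of by-subject random intercepts (made up)
omega_0 <- 80     # SD of by-item random intercepts (made up)
sigma   <- 200    # residual SD (made up)

subjects <- tibble(subj_id = 1:n_subj,
                   S_0 = rnorm(n_subj, 0, tau_0))

items <- tibble(item_id = 1:n_items,
                cond = rep(c("noun", "verb"), length.out = n_items),
                I_0 = rnorm(n_items, 0, omega_0))

## every subject responds to every item: crossed random factors
trials <- crossing(subjects, items) %>%
  mutate(X = if_else(cond == "noun", -0.5, 0.5),  # deviation coding
         RT = beta_0 + S_0 + I_0 + beta_1 * X + rnorm(n(), 0, sigma))
```
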
4 changes: 2 additions & 2 deletions 08-generalized-linear-models.Rmd
@@ -151,8 +151,8 @@ $$np(1 - p).$$

The app below allows you to manipulate the intercept and slope of a line in log odds space and to see the projection of the line back into response space. Note the S-shaped ("sigmoidal") shape of the function in response space.

```{r logit-app, echo=FALSE, fig.cap="**Logistic regression web app** <https://rstudio-connect.psy.gla.ac.uk/logit>"}
knitr::include_app("https://rstudio-connect.psy.gla.ac.uk/logit", "800px")
```{r logit-app, echo=FALSE, fig.cap="**Logistic regression web app** <https://talklab.psy.gla.ac.uk/app/logit-site/>"}
knitr::include_app("https://talklab.psy.gla.ac.uk/app/logit-site/", "800px")
```

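If you want to see that mapping outside the app, here is a tiny sketch in base R; the intercept and slope values are arbitrary.

``` r
b0 <- 0     # intercept in log odds space (arbitrary)
b1 <- 1.5   # slope in log odds space (arbitrary)

x   <- seq(-4, 4, length.out = 100)
eta <- b0 + b1 * x           # the line in log odds space
p   <- 1 / (1 + exp(-eta))   # inverse logit: back to response (probability) space
## equivalently: p <- plogis(eta)

plot(x, p, type = "l", xlab = "x", ylab = "probability")
```
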
### Estimating logistic regression models in R
10 changes: 5 additions & 5 deletions docs/01-introduction.md
@@ -41,7 +41,7 @@ where $Y_i$ is the measured value of the dependent variable for observation $i$,

<div class='webex-solution'><button>See the R code to make the plot</button>

-```r
+``` r
library("tidyverse") # if needed

set.seed(62)
@@ -123,14 +123,14 @@ Data simulation inverts this process. You define the parameters of a model re
Let's look at an example. Let's assume you are interested in the question of whether being a parent of a toddler 'sharpens' your reflexes. If you've ever taken care of a toddler, you know that physical danger always seems imminent—they could fall off the chair they just climbed on, slam their finger in a door, bang their head on the corner of a table, etc.—so you need to be attentive and ready to act fast. You hypothesize that this vigilance will translate into faster response times in other situations where a toddler is not around, such as in a psychological laboratory. So you recruit a set of parents of toddlers to come into the lab. You give each parent the task of pressing a button as quickly as possible in response to a flashing light, and measure their response time (in milliseconds). For each parent, you calculate their mean response time over all trials. We can simulate the mean response time for each of, say, 50 parents using the `rnorm()` function in R. But before we do that, we will load in the packages that we need (tidyverse) and set the <a class='glossary' target='_blank' title='A value used to set the initial state of a random number generator.' href='https://psyteachr.github.io/glossary/r#random-seed'>random seed</a> to make sure that you (the reader) get the same random values as me (the author).


-```r
+``` r
library("tidyverse")

set.seed(2021) # can be any arbitrary integer
```


-```r
+``` r
parents <- rnorm(n = 50, mean = 480, sd = 40)
```

@@ -150,14 +150,14 @@ We chose to generate the data using `rnorm()`—a function that generates random
But of course, to test our hypothesis, we need a comparison group, so we define a control group of non-parents. We generate data from this control group in the same way as above, but changing the mean.


-```r
+``` r
control <- rnorm(n = 50, mean = 500, sd = 40)
```

Let's put them into a <a class='glossary' target='_blank' title='A container for tabular data with some different properties to a data frame' href='https://psyteachr.github.io/glossary/t#tibble'>tibble</a> to make it easier to plot and analyze the data. Each row from this table represents the mean response time from a particular subject.


-```r
+``` r
dat <- tibble(group = rep(c("parent", "control"), each = 50),
rt = c(parents, control))

Binary file modified docs/01-introduction_files/figure-html/basic-glm-1.png
Binary file modified docs/01-introduction_files/figure-html/tryptych-1.png
84 changes: 30 additions & 54 deletions docs/02-correlation-regression.md
@@ -29,15 +29,15 @@ You can create a correlation matrix in R using `base::cor()` or `corrr::correlat
Let's create a correlation matrix to see how it works. Start by loading in the packages we will need.


-```r
+``` r
library("tidyverse")
library("corrr") # install.packages("corrr") in console if missing
```

We will use the `starwars` dataset, which is a built-in dataset that becomes available after you load the tidyverse package. This dataset has information about various characters that have appeared in the Star Wars film series. Let's look at the correlations between `height`, `mass`, and `birth_year`.


-```r
+``` r
starwars %>%
select(height, mass, birth_year) %>%
correlate()
@@ -61,7 +61,7 @@ starwars %>%
You can look up any bivariate correlation at the intersection of any given row or column. So the correlation between `height` and `mass` is .134, which you can find in row 1, column 2 or row 2, column 1; the values are the same. Note that there are only `choose(3, 2)` = 3 unique bivariate relationships, but each appears twice in the table. We might want to show only the unique pairs. We can do this by appending `corrr::shave()` to our pipeline.


-```r
+``` r
starwars %>%
select(height, mass, birth_year) %>%
correlate() %>%
@@ -86,7 +86,7 @@ starwars %>%
Now we've only got the lower triangle of the correlation matrix, but the `NA` values are ugly and so are the leading zeroes. The **`corrr`** package also provides the `fashion()` function that cleans things up (see `?corrr::fashion` for more options).


-```r
+``` r
starwars %>%
select(height, mass, birth_year) %>%
correlate() %>%
@@ -110,7 +110,7 @@ starwars %>%
Correlations only provide a good description of the relationship if the relationship is (roughly) linear and there aren't severe outliers that are wielding too strong of an influence on the results. So it is always a good idea to visualize the correlations as well as to quantify them. The `base::pairs()` function does this. The first argument to `pairs()` is simply of the form `~ v1 + v2 + v3 + ... + vn` where `v1`, `v2`, etc. are the names of the variables you want to correlate.


-```r
+``` r
pairs(~ height + mass + birth_year, starwars)
```

@@ -122,7 +122,7 @@ pairs(~ height + mass + birth_year, starwars)
We can see that there is a big outlier influencing our data; in particular, there is a creature with a mass greater than 1200kg! Let's find out who this is and eliminate them from the dataset.


-```r
+``` r
starwars %>%
filter(mass > 1200) %>%
select(name, mass, height, birth_year)
@@ -138,7 +138,7 @@ starwars %>%
OK, let's see how the data look without this massive creature.


-```r
+``` r
starwars2 <- starwars %>%
filter(name != "Jabba Desilijic Tiure")

@@ -153,7 +153,7 @@ pairs(~height + mass + birth_year, starwars2)
Better, but there's a creature with an outlying birth year that we might want to get rid of.


-```r
+``` r
starwars2 %>%
filter(birth_year > 800) %>%
select(name, height, mass, birth_year)
@@ -169,7 +169,7 @@ starwars2 %>%
It's Yoda. He's as old as the universe. Let's drop him and see how the plots look.


-```r
+``` r
starwars3 <- starwars2 %>%
filter(name != "Yoda")

@@ -184,7 +184,7 @@ pairs(~height + mass + birth_year, starwars3)
That looks much better. Let's see how that changes our correlation matrix.


-```r
+``` r
starwars3 %>%
select(height, mass, birth_year) %>%
correlate() %>%
@@ -210,7 +210,7 @@ Note that these values are quite different from the ones we started with.
Sometimes it's not a great idea to remove outliers. Another approach to dealing with outliers is to use a robust method. The default correlation coefficient that is computed by `corrr::correlate()` is the Pearson product-moment correlation coefficient. You can also compute the Spearman correlation coefficient by changing the `method` argument to `correlate()`. This replaces the values with ranks before computing the correlation, so that outliers will still be included, but will have dramatically less influence.


-```r
+``` r
starwars %>%
select(height, mass, birth_year) %>%
correlate(method = "spearman") %>%
@@ -234,7 +234,7 @@ starwars %>%
Incidentally, if you are generating a report from R Markdown and want your tables to be nicely formatted you can use `knitr::kable()`.


-```r
+``` r
starwars %>%
select(height, mass, birth_year) %>%
correlate(method = "spearman") %>%
@@ -243,36 +243,13 @@ starwars %>%
knitr::kable()
```

-<table>
- <thead>
-  <tr>
-   <th style="text-align:left;"> term </th>
-   <th style="text-align:left;"> height </th>
-   <th style="text-align:left;"> mass </th>
-   <th style="text-align:left;"> birth_year </th>
-  </tr>
- </thead>
-<tbody>
-  <tr>
-   <td style="text-align:left;"> height </td>
-   <td style="text-align:left;"> </td>
-   <td style="text-align:left;"> </td>
-   <td style="text-align:left;"> </td>
-  </tr>
-  <tr>
-   <td style="text-align:left;"> mass </td>
-   <td style="text-align:left;"> .72 </td>
-   <td style="text-align:left;"> </td>
-   <td style="text-align:left;"> </td>
-  </tr>
-  <tr>
-   <td style="text-align:left;"> birth_year </td>
-   <td style="text-align:left;"> .15 </td>
-   <td style="text-align:left;"> .15 </td>
-   <td style="text-align:left;"> </td>
-  </tr>
- </tbody>
-</table>


+|term |height |mass |birth_year |
+|:----------|:------|:----|:----------|
+|height | | | |
+|mass |.72 | | |
+|birth_year |.15 |.15 | |

## Simulating bivariate data

@@ -350,7 +327,7 @@ Let's start by simulating data representing hypothetical humans and their height
I found some data [here](https://www.geogebra.org/m/RRprACv4) which I converted into a CSV file. If you want to follow along, download the file [heights_and_weights.csv](data/heights_and_weights.csv){download="heights_and_weights.csv"}. Here's how the scatterplot looks:


-```r
+``` r
handw <- read_csv("data/heights_and_weights.csv", col_types = "dd")

ggplot(handw, aes(height_in, weight_lbs)) +
@@ -366,7 +343,7 @@ ggplot(handw, aes(height_in, weight_lbs)) +
Now, that's not quite a linear relationship. We can make it into one by log transforming each of the variables first.


-```r
+``` r
handw_log <- handw %>%
mutate(hlog = log(height_in),
wlog = log(weight_lbs))
@@ -414,14 +391,14 @@ OK, how do we form `Sigma` in R so that we can pass it to the `mvrnorm()` functi
First let's define our covariance and store it in the variable `my_cov`.


-```r
+``` r
my_cov <- .96 * .26 * .65
```

Now we'll use `matrix()` to define our `Sigma`, `my_Sigma`.


-```r
+``` r
my_Sigma <- matrix(c(.26^2, my_cov, my_cov, .65^2), ncol = 2)
my_Sigma
```
@@ -444,7 +421,7 @@ my_Sigma
Great. Now that we've got `my_Sigma`, we're ready to use `MASS::mvrnorm()`. Let's test it out by creating 6 synthetic humans.


-```r
+``` r
set.seed(62) # for reproducibility

# passing the *named* vector c(height = 4.11, weight = 4.74)
@@ -469,7 +446,7 @@ log_ht_wt
So `MASS::mvrnorm()` returns a matrix with a row for each simulated human, with the first column representing the log height and the second column representing the log weight. But log heights and log weights are not very useful to us, so let's transform them back by using `exp()`, which is the inverse of the `log()` transform.


-```r
+``` r
exp(log_ht_wt)
```

Expand All @@ -488,7 +465,7 @@ So our first simulated human is 70.4 inches tall (about 5'5" or X) and weighs 19
OK, let's randomly generate a bunch of humans, transform them from log to inches and pounds, and plot them against our original data to see how we're doing.


-```r
+``` r
## simulate new humans
new_humans <- MASS::mvrnorm(500,
c(height_in = 4.11, weight_lbs = 4.74),
@@ -534,7 +511,7 @@ Given the estimates above for log height and weight, can you solve for $\beta_1$
<!-- TODO make this use webex -->


-```r
+``` r
b1 <- .96 * (.65 / .26)
b1
```
@@ -563,7 +540,7 @@ $$Y_i = -5.124 + 2.4X_i + e_i.$$
To check our results, let's first run a regression on the log-transformed data using `lm()`, which estimates parameters using ordinary least squares regression.


-```r
+``` r
summary(lm(wlog ~ hlog, handw_log))
```

@@ -593,7 +570,7 @@ Looks pretty close. The reason that it doesn't match exactly is only because we'
As another check, let's superimpose the regression line we computed by hand on the scatterplot of the log-transformed data.


-```r
+``` r
ggplot(handw_log, aes(hlog, wlog)) +
geom_point(alpha = .2) +
labs(x = "log(height)", y = "log(weight)") +
@@ -616,5 +593,4 @@ To close, here are a few implications from the relationship between correlation

## Exercises

<iframe src="https://rstudio-connect.psy.gla.ac.uk/covariance/?showcase=0" width="530px" height="480px" data-external="1"></iframe>

<iframe src="https://talklab.psy.gla.ac.uk/app/covariance-site/?showcase=0" width="530px" height="480px" data-external="1"></iframe>
Binary file modified docs/02-correlation-regression_files/figure-html/bye-yoda-1.png
Binary file modified docs/02-correlation-regression_files/figure-html/handw-log-1.png
Binary file modified docs/02-correlation-regression_files/figure-html/pairs-1.png
