---
title: 'Blake et al. (2015) exercise'
author: "Mike Frank"
date: "November 15, 2019"
format:
  html:
    toc: true
---
# Intro
This is an in-class exercise exploring Blake et al. (2015), [Ontogeny of fairness in seven societies](http://www.nature.com/nature/journal/v528/n7581/full/nature15703.html), *Nature*.
Please explore these data together (without looking at the analyses supplied by the authors).
The overall goal is to understand the degree to which the data support the authors' hypotheses, and to make some more awesome plots along the way.
```{r}
library(tidyverse)
# two helper functions
sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))}
ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation
```
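A quick way to see what these helpers do is to run them on a small vector with a missing value. The vector `x` below is made up for illustration, and the helper definitions are repeated only so that the chunk runs on its own.

```{r}
# helper definitions repeated from above so this chunk is self-contained
sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))}
ci95 <- function(x) {sem(x) * 1.96}

# made-up values, one of them missing
x <- c(2, 4, 6, NA)

sem(x)   # sd of (2, 4, 6) is 2 and there are 3 non-missing values, so 2 / sqrt(3)
ci95(x)  # half-width of an approximate 95% interval around mean(x, na.rm = TRUE)
```

Both helpers ignore missing values, so the `NA` affects neither the standard deviation nor the count.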
# Data Prep
First read in the data, as distributed by the journal.
```{r}
d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv",
              na = c("NA", ".")) # they use . to indicate NA
```
Do some preprocessing, taken directly from the supplemental material.
```{r}
facVars <- c("eq.uneq", "value", "decision")
d[, facVars] <- lapply(d[, facVars], factor)
d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial))
```
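To see what the `gsub` call is doing, here is a small sketch on made-up labels. It assumes the `trial` column holds a single-character prefix followed by digits (for example `t1`); that format is an assumption, so check `unique(d$trial)` on the real data.

```{r}
# hypothetical trial labels; the real values in d$trial may differ
trial_labels <- c("t1", "t2", "t16")

# "." matches the one-character prefix, "(\\d+)" captures the digits,
# and "\\1" replaces the whole match with just the captured digits
trial_numbers <- as.numeric(gsub(".(\\d+)", "\\1", trial_labels))
trial_numbers
```

With a longer prefix such as `trial16`, this pattern would leave letters behind and `as.numeric` would return `NA` with a warning.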
Rename things so that they are easy to deal with. I hate hard-to-remember abbreviations for condition names.
```{r}
d$trial_type <- factor(d$eq.uneq,
                       levels = c("E", "U"),
                       labels = c("Equal", "Unequal"))
d$condition <- factor(d$condition,
                      levels = c("AI", "DI"),
                      labels = c("Advantageous", "Disadvantageous"))
```
# Variable exploration
Describe the dataset graphically in ways that are useful for you to get a handle on the data collection effort.
Histograms are good. Ages of the participants are useful too.
Remember your `group_by` + `summarise` workflow. This will help you here.
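As a reminder of the `group_by` + `summarise` pattern, here is a minimal sketch on a made-up table. The column names mirror ones used elsewhere in this document (`country`, `actor.id`, `actor.age.years`), but the values are invented and `toy` is a hypothetical name.

```{r}
library(tidyverse)

# made-up trial-level rows; column names mirror the real data, values do not
toy <- tibble(
  country = c("US", "US", "US", "Peru", "Peru"),
  actor.id = c("a1", "a1", "a2", "b1", "b1"),
  actor.age.years = c(5, 5, 9, 7, 7)
)

# first collapse to one row per participant, then summarise by country
toy_summary <- toy %>%
  distinct(country, actor.id, actor.age.years) %>%
  group_by(country) %>%
  summarise(
    n_participants = n(),
    mean_age = mean(actor.age.years, na.rm = TRUE)
  )
toy_summary
```

Collapsing to one row per participant first matters because the data are trial-level: without it, participants with more trials would count more heavily in the age summary.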
```{r}
```
Make sure you understand what the design was: how many trials per participant, what was between- and within-subject, etc.
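One way to check a design is to count, for each participant, how many trials they have and how many distinct levels of each factor they saw. The sketch below does this on made-up rows; the toy values are arranged arbitrarily and say nothing about the real design, and `toy_design` is a hypothetical name.

```{r}
library(tidyverse)

# made-up rows: two participants with three trials each
toy_design <- tibble(
  actor.id = c("a1", "a1", "a1", "a2", "a2", "a2"),
  condition = c("AI", "AI", "AI", "DI", "DI", "DI"),
  eq.uneq = c("E", "U", "U", "E", "E", "U")
)

# per participant: number of trials and number of distinct levels of each factor
design_check <- toy_design %>%
  group_by(actor.id) %>%
  summarise(
    n_trials = n(),
    n_conditions = n_distinct(condition),
    n_trial_types = n_distinct(eq.uneq)
  )
design_check
```

A factor that takes one level for every participant is manipulated between subjects; a factor with more than one level per participant is manipulated within subjects.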
# Hypothesis-related exploration
In this section, explore the authors' hypotheses related to advantageous and disadvantageous inequity aversion. Create 1-3 plots that describe the support (or lack of it) for these hypotheses.
```{r}
ggplot(filter(d, !is.na(decision)),
       aes(x = actor.age.years, y = as.numeric(decision == "reject"), col = country)) +
  geom_jitter(alpha = .1, height = .05, width = .05) +
  facet_grid(trial_type ~ condition) +
  geom_smooth(method = "lm")
```
```{r}
# proportion of offers rejected, one row per participant and trial type
ms <- d %>%
  filter(!is.na(eq.uneq)) %>%
  mutate(
    age = actor.age.years,
    decision = decision == "reject"
  ) %>%
  group_by(country, trial_type, condition, age, actor.id) %>%
  summarise(mean = mean(decision, na.rm = TRUE), .groups = "drop")

# set the order of the country panels
ms$country <- factor(ms$country,
  levels = c(
    "Uganda", "US", "Canada",
    "Senegal", "India", "Peru", "Mexico"
  )
)

ggplot(ms, aes(x = age, y = mean, col = condition)) +
  geom_jitter(alpha = .3, width = 0, height = .02) +
  facet_grid(trial_type ~ country) +
  ylab("Proportion offers rejected") +
  xlab("Age (years)") +
  ylim(c(0, 1)) +
  langcog::scale_colour_solarized() + # needs the langcog package to be installed
  theme_bw() +
  theme(legend.position = "bottom")
```
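One possible next step is to aggregate participant-level rejection rates into group means with error bars, which is what the `sem` and `ci95` helpers are for. The sketch below uses a made-up table shaped like `ms`; the names `toy_ms` and `toy_mss` and all values are hypothetical, and the helpers are repeated only so that the chunk runs on its own.

```{r}
library(tidyverse)

# helpers repeated from the top of the document so this chunk is self-contained
sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))}
ci95 <- function(x) {sem(x) * 1.96}

# made-up participant-level rejection rates, shaped like `ms`
toy_ms <- tibble(
  condition = rep(c("Advantageous", "Disadvantageous"), each = 4),
  age = rep(c(5, 5, 9, 9), times = 2),
  mean = c(0.0, 0.2, 0.4, 0.6, 0.2, 0.4, 0.6, 0.8)
)

# group means with approximate 95% confidence intervals
toy_mss <- toy_ms %>%
  group_by(condition, age) %>%
  summarise(
    mean_reject = mean(mean),
    ci = ci95(mean),
    .groups = "drop"
  )

ggplot(toy_mss, aes(x = age, y = mean_reject, col = condition)) +
  geom_pointrange(aes(ymin = mean_reject - ci, ymax = mean_reject + ci),
                  position = position_dodge(width = .5)) +
  ylab("Proportion offers rejected") +
  xlab("Age (years)")
```

On the real data, the same pattern could be applied to `ms`, perhaps after binning age, since the normal approximation is shaky when a group contains only a few participants.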