blake_exercise.qmd

---
title: 'Blake et al. (2015) exercise'
author: "Mike Frank"
date: "November 15, 2019"
format: 
  html:
    toc: true
---

# Intro

This is an in-class exercise exploring Blake et al. (2015), [Ontogeny of fairness in seven societies](http://www.nature.com/nature/journal/v528/n7581/full/nature15703.html), *Nature*.

Please explore these data together (without looking at the analyses supplied by the authors). 

The overall goal is to understand the degree to which the data support the authors' hypotheses, and to make some more awesome plots along the way.

```{r}
library(tidyverse)

# two helper functions
sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))}
ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation
```

# Data Prep

First read in the data, as distributed by the journal. 

```{r}
d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv", 
              na = c("NA", ".")) # they use . to indicate NA
```

Do some preprocessing, taken directly from the supplemental material. 

```{r}
facVars <- c("eq.uneq", "value", "decision")
d[, facVars] <- lapply(d[, facVars], factor)
d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial))
```

Rename things so that they are easy to deal with. I hate hard to remember abbreviations for condition names. 

```{r}
d$trial_type <- factor(d$eq.uneq, 
                       levels = c("E","U"), 
                       labels = c("Equal","Unequal"))
d$condition <- factor(d$condition,
                      levels = c("AI","DI"), 
                      labels = c("Advantageous","Disadvantageous"))
```

# Variable exploration

Describe the dataset graphically in ways that are useful for you to get a handle on the data collection effort. 

Histograms are good. Ages of the participants are useful too. 

Remember your `group_by` + `summarise` workflow. This will help you here.

```{r}
```

Make sure you understand what the design was: how many trials per participant, what was between- and within-subject, etc. 

# Hypothesis-related exploration

In this second, explore the authors' hypotheses related to advantageous and inadvantageous inequity aversion. Create 1 - 3 pictures that describe the support (or lack of it) for this hypothesis. 

```{r}
ggplot(filter(d, !is.na(decision)), 
       aes(x = actor.age.years, y = as.numeric(decision == "reject"), col = country)) + 
  geom_jitter(alpha = .1, height = .05, width = .05) + 
  facet_grid(trial_type ~ condition) + 
  geom_smooth(method = "lm")
```


```{r}
ms <- d %>%
    filter(!is.na(eq.uneq)) %>%
    mutate(
        age = actor.age.years,
        decision = decision == "reject"
    ) %>%
    group_by(country, trial_type, condition, age, actor.id) %>%
    summarise(mean = mean(decision, na.rm = TRUE))

ms$country <- factor(ms$country,
    levels = c(
        "Uganda", "US", "Canada",
        "Senegal", "India", "Peru", "Mexico"
    )
)

ggplot(ms, aes(x = age, y = mean, col = condition)) +
    geom_jitter(alpha = .3, width = 0, height = .02) +
    facet_grid(trial_type ~ country) +
    ylab("Proportion offers rejected") +
    xlab("Age (years)") +
    ylim(c(0, 1)) +
    langcog::scale_colour_solarized() +
    theme_bw() +
    theme(legend.position = "bottom")
```