---
title: "Data management"
output: html_notebook
---
# Intro
This script sets up the project to use the tidyverse packages and uses the `HMDHFDplus` package to download the required data from the Human Mortality Database (HMD). To start with, only life expectancy estimates are required.
# Load packages
n.b. `pacman` has been installed already
```{r}
pacman::p_load(
  tidyverse,
  HMDHFDplus
)
```
# HMD
Which countries are available?
```{r}
all_countries <- getHMDcountries()
```
For each country, we want to know which items are available.
But first, the HMD username and password need to be supplied.
```{r}
my_username <- userInput()
```
```{r}
my_password <- userInput()
```
```{r}
country_availability <- tibble(
  code = all_countries
) %>%
  mutate(
    # List the items HMD offers for each country code
    available_items = map(
      .x = code, .f = getHMDitemavail,
      username = my_username, password = my_password
    ),
    # Flag whether period life expectancy (E0per) is among them
    contains_e0 = map_lgl(available_items, ~ "E0per" %in% .x)
  )
```
```{r}
country_availability
```
All countries have the item `E0per`.
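The claim above can also be checked programmatically rather than by reading the printed table. The following is a minimal sketch on an invented stand-in for `country_availability` (the country codes and flags are made up, since the real table needs HMD credentials); `toy_availability`, `all_have_e0`, and `missing_e0` are hypothetical names.

```r
# Toy stand-in for country_availability; codes and flags are invented
toy_availability <- data.frame(
  code = c("AAA", "BBB", "CCC"),
  contains_e0 = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# TRUE only if every country lists the item
all_have_e0 <- all(toy_availability$contains_e0)

# Codes of any countries that lack the item
missing_e0 <- toy_availability$code[!toy_availability$contains_e0]
```

On the real data, `all(country_availability$contains_e0)` should return `TRUE` if the statement above holds.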
Now we want to extract `E0per` for each country.
```{r}
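e0_data <- tibble(
  code = all_countries
) %>%
  mutate(
    # Download the period life expectancy table for each country
    e0_df = map(code, readHMDweb, item = "E0per", username = my_username, password = my_password)
  ) %>%
  # Name the list column explicitly; a bare unnest() is deprecated in recent tidyr
  unnest(e0_df) %>%
  # Reshape to long format: one row per country, year and sex
  gather(key = "sex", value = "e0", Female:Total) %>%
  mutate(sex = tolower(sex)) %>%
  rename(year = Year)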
```
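To see what the unnest-then-gather step does without needing HMD credentials, here is a small sketch on invented data. The country codes and life expectancy values are made up, and `toy_nested` and `toy_long` are hypothetical names; only the column layout (`Year`, `Female`, `Male`, `Total`) mirrors what the chunk above assumes.

```r
library(dplyr)
library(tidyr)

# One row per country, each holding a small data frame (values are invented)
toy_nested <- tibble(
  code = c("AAA", "BBB"),
  e0_df = list(
    tibble(Year = c(2000, 2001), Female = c(80.1, 80.3),
           Male = c(75.0, 75.2), Total = c(77.6, 77.8)),
    tibble(Year = 2000, Female = 82.0, Male = 77.0, Total = 79.5)
  )
)

toy_long <- toy_nested %>%
  unnest(e0_df) %>%                                   # one row per country-year
  gather(key = "sex", value = "e0", Female:Total) %>% # one row per country-year-sex
  mutate(sex = tolower(sex)) %>%
  rename(year = Year)
```

The three country-year rows become nine rows, one for each combination of country, year and sex, with columns `code`, `year`, `sex` and `e0`.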
Now to write this out
```{r}
write_rds(e0_data, "tidy_data/e0_period.rds")
```
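The write step assumes that the `tidy_data` directory already exists; `write_rds()` does not create it. A minimal sketch of a guard follows, where `ensure_dir` is a hypothetical helper (not part of the original notebook) and the demonstration uses a temporary directory so that it has no side effects on the project.

```r
# Hypothetical helper: create a directory if it is not already there
ensure_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  invisible(path)
}

# Demonstration in a temporary location; in this notebook the call
# would be ensure_dir("tidy_data") before the write_rds() step
demo_dir <- file.path(tempdir(), "tidy_data")
ensure_dir(demo_dir)
```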
# Extract age-specific mortality rates too
The main thing to work out is whether mortality trends in infancy, young adulthood, and old age have been similar in many countries.
Perhaps it would be good to get correlations between trends for these different age groups?
I think we should extract `Mx_1x1`.
```{r}
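Mx_data <- tibble(
  code = all_countries
) %>%
  mutate(
    # Download age-specific mortality rates (single year of age, single calendar year)
    Mx_df = map(code, readHMDweb, item = "Mx_1x1", username = my_username, password = my_password)
  ) %>%
  # Name the list column explicitly; a bare unnest() is deprecated in recent tidyr
  unnest(Mx_df) %>%
  select(-OpenInterval) %>%
  # Reshape to long format: one row per country, year, age and sex
  gather(key = "sex", value = "Mx", Female:Total) %>%
  mutate(sex = tolower(sex)) %>%
  rename(year = Year, age = Age)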
```
Now write this out.
```{r}
write_rds(Mx_data, "tidy_data/Mx_data.rds")
```
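As a first pass at the correlation idea raised earlier in this section, the sketch below correlates year-on-year changes in log mortality between three ages. It is only one possible way to define a "trend", not a method from the original notebook; the ages (0, 25, 80) and all rates are invented for illustration, and `toy_mx`, `log_changes` and `trend_cor` are hypothetical names.

```r
# Invented mortality rates for one country and sex at three ages
toy_mx <- data.frame(
  year = rep(2000:2005, times = 3),
  age  = rep(c(0, 25, 80), each = 6),
  Mx   = c(
    0.0060, 0.0057, 0.0055, 0.0052, 0.0050, 0.0047,
    0.0010, 0.0010, 0.0009, 0.0009, 0.0009, 0.0008,
    0.0700, 0.0690, 0.0670, 0.0665, 0.0650, 0.0630
  )
)

# Year-on-year change in log(Mx) for each age;
# assumes rows are sorted by year within each age
log_changes <- sapply(
  split(toy_mx$Mx, toy_mx$age),
  function(m) diff(log(m))
)

# Pairwise correlations between the age-specific series of changes
trend_cor <- cor(log_changes)
```

With the real `Mx_data`, the same steps would be applied within each combination of `code` and `sex`, after filtering to the ages of interest.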