This repository has been archived by the owner on Aug 30, 2021. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathCS_09.Rmd
197 lines (164 loc) · 6.63 KB
/
CS_09.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
---
title: Tracking Hurricanes!
subtitle: Analyze historical storm data from the NOAA API
week: 9
type: Case Study
reading:
- Overview of the [International Best Track Archive for Climate Stewardship (IBTrACS)](https://www.ncdc.noaa.gov/ibtracs/index.php?name=ibtracs-data-access)
- Performing [Spatial Joins with sf](https://r-spatial.github.io/sf/reference/st_join.html)
tasks:
- Write a .Rmd script to perform the following tasks
- Use an API to access NOAA Storm data over the web
- Intersect the storms with US states to quantify how many storms in the database have hit each state.
---
```{r, echo=FALSE, message=FALSE, results='hide', purl=FALSE}
source("functions.R")
source("knitr_header.R")
```
# Reading
```{r reading,results='asis',echo=F}
md_bullet(rmarkdown::metadata$reading)
```
# Tasks
```{r tasks,results='asis',echo=F}
md_bullet(rmarkdown::metadata$tasks)
```
## Libraries & Data
```{r message=F,warning=FALSE}
library(sf)
library(tidyverse)
library(ggmap)
library(rnoaa)
library(spData)
data(world)
data(us_states)
```
## Objective
In this case study you will download storm track data from NOAA, make a summary plot, and quantify how many storms have hit each of the United States. This will require you to use a spatial join (`st_join`).
### Your goal
```{r download_data, eval=F, echo=T, purl=F, warning=F}
# Doesn't work in October 2020.
storms <- storm_shp(basin = "NA")
storm_data <- read_sf(storms$path)
```
```{r download_data2, eval=T, echo=T, purl=F, warning=F}
# 2020 update - it appears NOAA changed the URL which broke the R function. Use the following instead of storm_shp().
dataurl="https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/shapefile/IBTrACS.NA.list.v04r00.points.zip"
tdir=tempdir()
download.file(dataurl,destfile=file.path(tdir,"temp.zip"))
unzip(file.path(tdir,"temp.zip"),exdir = tdir)
list.files(tdir)
storm_data <- read_sf(list.files(tdir,pattern=".shp",full.names = T))
```
```{r, echo=F, purl=F, warning=F}
# function to drop storm points not near main tracks
drop_outliers <- function(x) {
central_point=st_union(x) %>%
st_centroid()
dists=st_distance(x,central_point) #get distances from central point
x2=filter(x,as.numeric(dists)<100000) #keep only points within 100km of centroid
return(x) #return filtered dataset
}
storms <- storm_data%>%
filter(year>=1950)%>%
drop_outliers() %>%
mutate_if(is.numeric,
function(x) ifelse(x==-999.0,NA,x))%>% #-999 to NA
mutate(decade=floor(year/10)*10) #add decade
region=st_bbox(storms)
```
Your desired figure looks something like the following:
```{r, echo=F, purl=F, warning=F, fig.width=15, fig.height=15}
map1=ggplot() +
geom_sf(data=world,
inherit.aes = F,size=.1,
fill="grey",colour="black")+
facet_wrap(~decade)+
stat_bin2d(data=storms,
aes(y=st_coordinates(storms)[,2],
x=st_coordinates(storms)[,1]),bins=100)+
scale_fill_distiller(palette="YlOrRd",
trans="log",
direction=-1,
breaks = c(1,10,100,1000))+
coord_sf(ylim=region[c(2,4)],
xlim=region[c(1,3)])+
labs(x="",y="")
map1
```
Calculate a table of the five states that have experienced the most storms.
```{r, echo=F, purl=F, warning=F, message=F}
library(knitr);library(kableExtra)
states<- st_transform(us_states,st_crs(storms)) %>%
dplyr::select(state=NAME)
storm_states <- st_join(storms, states,
join = st_intersects,
left = F)
storm_states%>%
group_by(state)%>%
summarize(storms=length(unique(NAME)))%>%
st_set_geometry(NULL)%>%
arrange(desc(storms))%>%
slice(1:5)%>%
kable()%>%
kable_styling()
```
<div class="well">
<button data-toggle="collapse" class="btn btn-primary btn-sm round" data-target="#demo1">Show Hints</button>
<div id="demo1" class="collapse">
## Steps
1. Use the API to Download storm data
* Use `storm_shp()` for `basin = "NA"`
* Read the points in with `storm_shp_read()`
* Convert to `sf` format with `st_as_sf()`
2. Wrangle the data
* Filter to storms 1950-present with `filter()`
* Use `mutate_if()` to convert `-999.0` to `NA` in all numeric columns with the following command from the `dplyr` package: `mutate_if(is.numeric,` `function(x) ifelse(x==-999.0,NA,x))`
* Use the following command to add a column for decade: `mutate(decade=(floor(year/10)*10))`
* Use `st_bbox()` to identify the bounding box of the storm data and save this as an object called `region`.
3. Make the first plot
* Use `ggplot()` to plot the `world` polygon layer and add the following:
* add `facet_wrap(~decade)` to create a panel for each decade
* add `stat_bin2d(data=storms,` `aes(y=st_coordinates(storms)[,2],` `x=st_coordinates(storms)[,1]),bins=100)`
* use
`scale_fill_distiller(palette="YlOrRd",`
`trans="log",`
`direction=-1,`
`breaks = c(1,10,100,1000))` to set the color ramp
* use `coord_sf(ylim=region[c(2,4)],`
`xlim=region[c(1,3)])` to crop the plot to the region.
4. Calculate table of the five states with most storms.
* use `st_transform` to reproject `us_states` to the reference system of the `storms` object (you can extract a CRS from a sf object with `st_crs(storms)`
* Rename the `NAME` column in the state data to `state` to avoid confusion with storm name using `select(state=NAME)`
* Perform a spatial join between the storm database and the states object with: `storm_states <- st_join(storms, states, `
`join = st_intersects,left = F)`. This will 'add` the state to any storm that was recorded within that state.
* Use `group_by(state)` to group the next step by US state
* use `summarize(storms=length(unique(NAME)))` to count how many unique storms occurred in each state.
* use `arrange(desc(storms))` to sort by the number of storms in each state
* use `slice(1:5)` to keep only the top 5 states
```
</div>
</div>
---
<div class="extraswell">
<button data-toggle="collapse" class="btn btn-link" data-target="#extras">
Extra time? Try this...
</button>
<div id="extras" class="collapse">
Try to replicate the following graphic using the data you transformed above.
```{r, echo=F, purl=F, fig.height=10,fig.width=10}
storm_states_year <- storm_states%>%
group_by(state,year)%>%
summarize(storms=length(unique(NAME)))
ggplot(storm_states_year,
# aes(y=fct_reorder(state, storms),
aes(y=state,
x=year,
fill=storms))+
geom_tile()+
scale_fill_viridis_c(name="# Storms")+
ylab("State")
```
Can you sort the rows (states) in order of storm frequency (instead of alphabetical)?
</div>
</div>