---
title: "Geospatial analysis in R"
output:
  html_document:
    fig_caption: no
    number_sections: yes
    toc: yes
    toc_float: false
    collapsed: no
---
```{r set-options, echo=FALSE}
options(width = 105)
knitr::opts_chunk$set(dev='png', dpi=300, cache=FALSE)
pdf.options(useDingbats = TRUE)
klippy::klippy(position = c('top', 'right'))
```
<p><span style="color: #00cc00;">NOTE: This page has been revised for Winter 2021, but may undergo further edits.</span></p>
# Introduction #
The `sf` package (and `sp` before it), with its direct links to the `GDAL`, `GEOS`, and `PROJ` libraries, gives **R** a powerful geospatial-analysis capability, and at this point in the development of packages and methods for handling spatial data, practically anything that can be accomplished in commercial or open-source GIS software can be done in **R**. Because **R** is script-oriented, this has the added benefit of reproducibility, inasmuch as scripts can be saved and reused, while analyses done in menu-focused GUIs cannot be.
The examples here cover some of the typical things one might want to do in a GIS package, with an emphasis on tasks arising in physical geography.
# Extract data from a raster #
This example demonstrates how to extract data from a raster for a set of target points contained in a .csv file. In this case, the raster is the Ramankutty and Foley potential natural vegetation data set ([https://www.nelson.wisc.edu/sage/data-and-models/global-land-use/index.php](https://www.nelson.wisc.edu/sage/data-and-models/global-land-use/index.php)), and the target points are the Global Charcoal Database (GCDv3) site locations ([http://www.gpwg.paleofire.org](http://www.gpwg.paleofire.org)), with the task being to "look up" the vegetation at each site.
Load some packages:
```{r begin extract, results="hide", message=FALSE}
# load packages
library(raster)
library(rasterVis)
library(RColorBrewer)
library(sf)
```
## Read the data sets -- source and target ##
Use the `raster` package to read the vegetation data, which are stored in a netCDF file, a self-documenting, machine-independent format in which a lot of Earth-system science data (like climate-model output or satellite remote-sensing data) are stored. Note that the `ncdf4` package is automatically loaded if the source file is a netCDF file.
```{r read veg}
# read potential natural vegetation data sage_veg30.nc:
# modify the following path to reflect local files
vegtype_path <- "/Users/bartlein/Documents/geog495/data/nc_files/"
vegtype_name <- "sage_veg30.nc"
vegtype_file <- paste(vegtype_path, vegtype_name, sep="")
vegtype <- raster(vegtype_file, varname="vegtype")
vegtype
```
Plot the vegetation data using a `rasterVis` levelplot:
```{r plot veg}
mapTheme <- rasterTheme(region=rev(brewer.pal(8,"Greens")))
levelplot(vegtype, margin=FALSE, par.settings=mapTheme,
          main="Potential Natural Vegetation")
```
Read the charcoal data locations:
```{r read charcoal}
# read GCDv3 sites
# modify the following path to reflect local files
csv_path <- "/Users/bartlein/Documents/geog495/data/csv/"
csv_name <- "v3i_nsa_globe.csv"
csv_file <- paste(csv_path, csv_name, sep="")
gcdv3 <- read.csv(csv_file)
str(gcdv3)
plot(gcdv3$Lon, gcdv3$Lat, pch=16, cex=0.5, col="blue")
```
In order to use the `extract()` function from `raster`, the target points must be turned into an `sf` object.
```{r sf1}
# turn into an sf object
gcdv3_sf <- st_as_sf(gcdv3, coords = c("Lon", "Lat"))
class(gcdv3_sf)
gcdv3_sf
```
Add the CRS:
```{r sf2}
st_crs(gcdv3_sf) <- st_crs("+proj=longlat")
```
Plot the target points on top of the source map. The `as_Spatial()` function converts an `sf` object to an `sp` spatial-data type on the fly.
```{r plot both}
plt <- levelplot(vegtype, margin=FALSE, par.settings=mapTheme,
                 main="Potential Natural Vegetation")
plt + layer(sp.points(as_Spatial(gcdv3_sf), col="blue", pch=16, cex=0.5))
```
## Extract data at target points ##
Now extract the data for the target points:
```{r extract}
# extract data from the raster at the target points
gcdv3_vegtype <- extract(vegtype, gcdv3_sf, method="simple")
class(gcdv3_vegtype)
head(gcdv3_vegtype)
```
Make a dataframe of the extracted data that could be saved as a .csv file, and plot it:
```{r make dataframe}
pts <- data.frame(gcdv3$Lon, gcdv3$Lat, gcdv3_vegtype)
names(pts) <- c("Lon", "Lat", "vegtype")
head(pts, 10)
plotclr <- rev(brewer.pal(8,"Greens"))
plotclr <- c("#AAAAAA", plotclr)
cutpts <- c(0, 2, 4, 6, 8, 10, 12, 14, 16)
color_class <- findInterval(gcdv3_vegtype, cutpts)
plot(pts$Lon, pts$Lat, col=plotclr[color_class+1], pch=16)
```
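The `pts` data frame could be saved with `write.csv()`; a minimal sketch (not run), in which the output filename is an assumption:
```{r write pts, eval=FALSE}
# not run: save the extracted values as a .csv file; the output
# filename here is hypothetical
write.csv(pts, paste(csv_path, "gcdv3_vegtype.csv", sep=""), row.names=FALSE)
```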
Plot the extracted data at the target points on top of the source map. If the extraction is successful, the target-point colors should disappear against the background.
```{r plot over}
plt <- levelplot(vegtype, margin=FALSE, par.settings=mapTheme,
                 main="Potential Natural Vegetation")
plotclr <- rev(brewer.pal(8,"Greens"))
cutpts <- c(0, 2, 4, 6, 8, 10, 12, 14, 16)
color_class <- findInterval(gcdv3_vegtype, cutpts)
plt + layer(sp.points(as_Spatial(gcdv3_sf), col=plotclr[color_class], pch=16, cex=0.6)) +
  layer(sp.points(as_Spatial(gcdv3_sf), col="black", pch=1, cex=0.6))
```
Looks ok.
[[Back to top]](lec07.html)
## A second example -- explicit cell selection ##
Here's a second example of extracting values from an array, by referencing the specific cell or array element that a target point falls in. In this example, a netCDF file of bioclimatic variables is read using `ncdf4` and the values of `mtco` (mean temperature of the coldest month) are extracted.
```{r library2, cache=FALSE}
library(ncdf4)
library(lattice)
library(classInt)
library(RColorBrewer)
```
```{r ncdf4 read mtco, echo=TRUE, eval=TRUE, cache=FALSE}
# set path and filename
# modify the following path to reflect local files
ncpath <- "/Users/bartlein/Documents/geog495/data/nc_files/"
ncname <- "cru10min30_bio.nc"
ncfname <- paste(ncpath, ncname, sep="")
```
```{r open mtco, echo=TRUE, eval=TRUE, cache=FALSE}
# open a netCDF file
ncin <- nc_open(ncfname)
print(ncin)
```
```{r ncdf4 mtco}
# get longitude and latitude
lon <- ncvar_get(ncin,"lon")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin,"lat")
nlat <- dim(lat)
head(lat)
print(c(nlon,nlat))
```
```{r ncdf4 get data}
# get the mtco data
mtco <- ncvar_get(ncin,"mtco")
dlname <- ncatt_get(ncin,"mtco","long_name")
dunits <- ncatt_get(ncin,"mtco","units")
fillvalue <- ncatt_get(ncin,"mtco","_FillValue")
dim(mtco)
```
```{r recode missing mtco}
mtco[mtco==fillvalue$value] <- NA
```
Close the netCDF file using the `nc_close()` function.
```{r ncdf4 close}
# close the netCDF file
nc_close(ncin)
```
Plot the control data.
```{r levelplot1, fig.width=5, fig.height=3}
# levelplot of the slice
grid <- expand.grid(lon=lon, lat=lat)
cutpts <- c(-50,-40,-30,-20,-10,0,10,20,30,40,50)
levelplot(mtco ~ lon * lat, data=grid, at=cutpts, cuts=11, pretty=TRUE,
          col.regions=(rev(brewer.pal(10,"RdBu"))))
```
Now extract the data for the target points:
Get the indices (`j`'s and `k`'s) of the grid cell that each target point lies in; that is, for each target point, figure out which column (`j`) and row (`k`) of the control-data array it falls in. This code is basically the same as that used in reshaping a "short" data frame into an array: the function defined and executed within `sapply()` finds, for each target point, the grid cell whose center is closest, and the `j`'s and `k`'s together identify the control-data grid cells that the individual target points fall in.
```{r get jk}
j <- sapply(gcdv3$Lon, function(x) which.min(abs(lon-x)))
k <- sapply(gcdv3$Lat, function(x) which.min(abs(lat-x)))
head(cbind(j,k)); tail(cbind(j,k))
```
Get the data for each `j`, `k` combination. The way to do this is to convert the `mtco` array to a vector, and then calculate an index `jk` for each target value:
```{r get data}
mtco_vec <- as.vector(mtco)
jk <- (k-1)*nlon + j
gcdv3_mtco <- mtco_vec[jk]
head(cbind(j,k,jk,gcdv3_mtco,lon[j],lat[k]))
```
```{r make dataframe mtco}
gcdv3_mtco[is.na(gcdv3_mtco)] <- -99
pts <- data.frame(gcdv3$Lon, gcdv3$Lat, gcdv3_mtco)
names(pts) <- c("Lon", "Lat", "mtco")
head(pts, 20)
```
Plot the extracted values of `mtco`. To do this, the colors for plotting the different levels of `mtco` are generated from an `RColorBrewer` palette, and augmented by gray (`"#AAAAAA"`) to handle missing values (i.e., from marine charcoal records, or from sites "off" the cru10min30 grid).
```{r plot mtco}
plotclr <- rev(brewer.pal(10,"RdBu"))
plotclr <- c("#AAAAAA", plotclr)
cutpts <- c(-50,-40,-30,-20,-10,0,10,20,30,40,50)
color_class <- findInterval(gcdv3_mtco, cutpts)
plot(gcdv3$Lon, gcdv3$Lat, col=plotclr[color_class+1], pch=16)
```
[[Back to top]](lec07.html)
# Clipping/Trimming/Point-in-polygon analyses #
A common problem that arises in dealing with spatial data is the "clipping" or "trimming" problem, in which one wants to know whether one or more points lie within the outline of a particular polygon, or, in the case of multiple polygons, which polygon an individual point lies in. More formally, this is known as the "point-in-polygon" problem. Here the process of clipping the na10km_v2 climate grid to a tree-species shapefile, in particular that for *Picea mariana* (black spruce), is demonstrated. What we want is a list of climate-data-set points that lie within the range of *P. mariana*.
## Read the polygon and target-point files ##
Read the shapefile for *Picea mariana*:
```{r read picae}
# read the shapefile for Picea mariana
# modify the following path to reflect local files
shp_file <- "/Users/bartlein/Documents/geog495/data/shp/picemari.shp"
picea_sf <- st_read(shp_file)
picea_sf
```
Plot the outline.
```{r plot picea}
plot(st_geometry(picea_sf))
```
Read the na10km_v2 points from a .csv file (for illustration; in practice it would be more efficient to read them from a netCDF file, as sketched below). This file contains the grid-point locations of a present-day climate data set, and we want a list of the points that fall within the range limits of *Picea mariana*.
```{r read na10km_v2}
# read the na10km_v2 points (as a .csv file)
csv_file <- "/Users/bartlein/Documents/geog495/data/csv/na10km_v2.csv"
na10km_v2 <- read.csv(csv_file)
str(na10km_v2)
```
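As noted above, the same point locations could be read from a netCDF version of the file. A minimal sketch (not run), in which the file and variable names are assumptions for illustration:
```{r read na10km nc, eval=FALSE}
# not run: the file and variable names here are hypothetical
library(ncdf4)
ncin <- nc_open("/Users/bartlein/Documents/geog495/data/nc_files/na10km_v2.nc")
lon <- ncvar_get(ncin, "lon") # 2-d arrays of grid-point longitudes...
lat <- ncvar_get(ncin, "lat") # ...and latitudes
nc_close(ncin)
```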
We don't need data from east of 45 W, so trim the data set.
```{r }
na10km_v2 <- na10km_v2[na10km_v2$lon <= -45.0, ]
str(na10km_v2)
```
Make an `sf` object of the na10km_v2 data-point locations, and plot it:
```{r na10km dataframe}
# make an sf object
na10km_v2_sf <- st_as_sf(na10km_v2, coords = c("lon", "lat"))
st_crs(na10km_v2_sf) <- st_crs("+proj=longlat")
na10km_v2_sf
plot(st_geometry(na10km_v2_sf), pch=16, cex=0.2) # takes a little while
```
## Overlay the points onto the polygons ##
Now overlay the points onto the polygons. The `st_join()` function can take a little while to run.
```{r over}
# overlay the two
st_crs(picea_sf) <- st_crs(na10km_v2_sf) # make sure CRS's exactly match
picea_pts_sf <- st_join(na10km_v2_sf, picea_sf)
picea_pts_sf <- na.omit(picea_pts_sf)
picea_pts_sf
```
The warning `although coordinates are longitude/latitude, st_intersects assumes that they are planar` arises because longitude and latitude are in fact anisotropic, and it would be better to do this with projected data with isotropic coordinates (i.e., x- and y-coordinates in metres).
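A minimal sketch of that alternative (not run): transform both layers to a projected coordinate system before the join. The CRS here (EPSG:5070, an Albers equal-area projection for the conterminous U.S.) is an assumption, chosen only for illustration:
```{r project join sketch, eval=FALSE}
# not run: project both layers before joining; EPSG:5070 is an
# illustrative (hypothetical) choice of equal-area CRS
picea_proj <- st_transform(picea_sf, crs = 5070)
na10km_proj <- st_transform(na10km_v2_sf, crs = 5070)
picea_pts_proj <- st_join(na10km_proj, picea_proj)
```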
```{r}
plot(st_geometry(na10km_v2_sf), pch=16, cex=0.2, axes=TRUE)
plot(st_geometry(picea_pts_sf), pch=16, cex=0.2, col="green", add=TRUE)
```
[[Back to top]](lec07.html)
# Gridding or rasterizing point data #
Another common task is to grid or rasterize a set of point data, creating a gridded data set of counts, averages, minima, maxima, etc. This can be illustrated using the FPA-FOD daily fire-start data set: Spatial wildfire occurrence data for the United States, 1992-2013/Fire Program Analysis Fire-Occurrence Database [FPA_FOD_20150323] (3rd Edition) (Short, K.C., 2014, Earth Syst. Sci. Data, 6, 1-27) – [http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/](http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/). The idea here is to calculate the number and average area of the fires that occurred in the 0.5-degree grid cells that cover the conterminous U.S.
The analysis here uses an approach from the `sp` package.
## Read the data sets, and create empty rasters ##
```{r rasterize begin}
library(raster)
library(rasterVis)
```
Read the fire-start data:
```{r read fire}
# read the FPA-FOD fire-start data
# modify the following path to reflect local files
csv_path <- "/Users/bartlein/Dropbox/DataVis/working/data/csv_files/"
csv_name <- "fpafod_1992-2013.csv"
csv_file <- paste(csv_path, csv_name, sep="")
fpafod <- read.csv(csv_file) # takes a while
str(fpafod)
```
Convert the fire-start data to a `SpatialPointsDataFrame`
```{r fire dataframe}
fpafod_coords <- cbind(fpafod$longitude, fpafod$latitude)
fpafod_pts <- SpatialPointsDataFrame(coords=fpafod_coords, data=data.frame(fpafod$area_ha))
names(fpafod_pts) <- "area_ha"
```
Create a couple of empty rasters to hold the rasterized data:
```{r create rasters}
# create (empty) rasters
cell_size <- 0.5
lon_min <- -128.0; lon_max <- -65.0; lat_min <- 25.5; lat_max <- 50.5
ncols <- ((lon_max - lon_min)/cell_size)+1; nrows <- ((lat_max - lat_min)/cell_size)+1
us_fire_counts <- raster(nrows=nrows, ncols=ncols, xmn=lon_min, xmx=lon_max, ymn=lat_min, ymx=lat_max, res=0.5, crs="+proj=longlat +datum=WGS84")
us_fire_counts
us_fire_area <- raster(nrows=nrows, ncols=ncols, xmn=lon_min, xmx=lon_max, ymn=lat_min, ymx=lat_max, res=0.5, crs="+proj=longlat +datum=WGS84")
us_fire_area
```
## Rasterize the data ##
Now rasterize the data, first as counts (the number of fires in each grid cell), and then as the average size of the fires in each grid cell. Also plot the data (both on a log10 scale):
```{r rasterize}
# rasterize
us_fire_counts <- rasterize(fpafod_coords, us_fire_counts, fun="count")
us_fire_counts
plot(log10(us_fire_counts), col=brewer.pal(9,"BuPu"), sub="log10 Number of Fires")
us_fire_area <- rasterize(fpafod_pts, us_fire_area, fun=mean)
us_fire_area
plot(log10(us_fire_area$area_ha), col=brewer.pal(9,"YlOrRd"), sub="log10 Mean Area")
```
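As an aside, the rasterized layers could also be written directly to a raster-file format with `writeRaster()`; a minimal sketch (not run), in which the output filename and format are assumptions:
```{r write raster sketch, eval=FALSE}
# not run: write the counts raster as a (hypothetical) GeoTIFF file
writeRaster(us_fire_counts, "us_fire_counts.tif", format="GTiff", overwrite=TRUE)
```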
## Write the rasterized data out as a netCDF file ##
Write the two rasterized data sets out as variables in a netCDF data set. Create some variables, and replace the R `NA`s with netCDF fill values:
```{r netCDF setup}
# make necessary vectors and arrays
lon <- seq(lon_min+0.25, lon_max-0.25, by=cell_size)
lat <- seq(lat_max-0.25, lat_min+0.25, by=-1*cell_size)
print(c(length(lon), length(lat)))
fillvalue <- 1e32
us_fire_counts2 <- t(as.matrix(us_fire_counts$layer, nrows=ncols, ncols=nrows))
dim(us_fire_counts2)
us_fire_counts2[is.na(us_fire_counts2)] <- fillvalue
us_fire_area2 <- t(as.matrix(us_fire_area$area_ha, nrows=ncols, ncols=nrows))
dim(us_fire_area2)
us_fire_area2[is.na(us_fire_area2)] <- fillvalue
```
Write out a netCDF file:
```{r write netCDF}
# write out a netCDF file
library(ncdf4)
# path and file name, set dname
# modify the following path to reflect local files
ncpath <- "/Users/bartlein/Dropbox/DataVis/working/data/nc_files/"
ncname <- "us_fires.nc"
ncfname <- paste(ncpath, ncname, sep="")
# create and write the netCDF file -- ncdf4 version
# define dimensions
londim <- ncdim_def("lon","degrees_east",as.double(lon))
latdim <- ncdim_def("lat","degrees_north",as.double(lat))
# define variables
dname <- "fpafod_counts"
dlname <- "Number of fires, 1992-2013"
v1_def <- ncvar_def(dname,"1",list(londim,latdim),fillvalue,dlname,prec="single")
dname <- "fpafod_mean_area"
dlname <- "Average Fire Size, 1992-2013"
v2_def <- ncvar_def(dname,"ha",list(londim,latdim),fillvalue,dlname,prec="single")
# create netCDF file and put arrays
ncout <- nc_create(ncfname,list(v1_def, v2_def),force_v4=TRUE)
# put variables
ncvar_put(ncout,v1_def,us_fire_counts2)
ncvar_put(ncout,v2_def,us_fire_area2)
# put additional attributes into dimension and data variables
ncatt_put(ncout,"lon","axis","X")
ncatt_put(ncout,"lat","axis","Y")
# add global attributes
ncatt_put(ncout,0,"title","FPA-FOD Fires")
ncatt_put(ncout,0,"institution","USFS")
ncatt_put(ncout,0,"source","http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/")
ncatt_put(ncout,0,"references", "Short, K.C., 2014, Earth Syst. Sci. Data, 6, 1-27")
history <- paste("P.J. Bartlein", date(), sep=", ")
ncatt_put(ncout,0,"history",history)
ncatt_put(ncout,0,"Conventions","CF-1.6")
# Get a summary of the created file:
ncout
# close the file, writing data to disk
nc_close(ncout)
```
[[Back to top]](lec07.html)
# Interpolating/regridding #
Another common task involves transferring values from one gridded data set to another. When the grids are identical, this is trivial, but when the "target" grid differs from the source or "control" grid, interpolation is required (as it also is when the target points are irregularly distributed). The most widely used method for interpolation within a control grid is *bilinear interpolation*, which involves finding the control grid points that surround a particular target point, and then simultaneously (linearly) interpolating in the x- and y-directions. The method is implemented in the `raster` package, and a relatively fast version is implemented in the `fields` package.
The example here uses a lower-resolution version of the `ETOPO1` global DEM as the source file, and the locations of the (land-only) points in the na10km_v2 grid as the targets. These are read in here from a .csv file, but they could also have come from a netCDF file.
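To make the arithmetic concrete, here is a minimal sketch of bilinear interpolation at a single target point (not run), assuming a regular grid with ascending coordinate values and a target point that lies inside the grid; `fields::interp.surface()`, used below, does the equivalent for many points at once, with proper edge handling:
```{r bilinear sketch, eval=FALSE}
# not run: bilinear interpolation of the gridded values z at one target
# point (x0, y0); assumes lon and lat are ascending and (x0, y0) lies
# within the grid
bilin <- function(lon, lat, z, x0, y0) {
  j <- findInterval(x0, lon); k <- findInterval(y0, lat) # enclosing cell
  tx <- (x0 - lon[j]) / (lon[j + 1] - lon[j]) # fractional x-distance
  ty <- (y0 - lat[k]) / (lat[k + 1] - lat[k]) # fractional y-distance
  (1 - tx) * (1 - ty) * z[j, k] + tx * (1 - ty) * z[j + 1, k] +
    (1 - tx) * ty * z[j, k + 1] + tx * ty * z[j + 1, k + 1]
}
```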
## Open the "control" netCDF and target .csv files ##
Load the appropriate packages.
```{r ncdf4 package, message=FALSE, warning=FALSE, results='hide'}
# load the ncdf4 package
library(ncdf4)
library(lattice)
library(fields)
```
Read the etopo1 netCDF file. This particular file is one in which the original 30-sec data have been aggregated (by averaging) to six minutes, or one-tenth of a degree, to speed up the execution of the examples; a sketch of that aggregation step appears below. In practice, one would work with the original higher-resolution data. Do some setup, open the file, and list its contents.
```{r ncdf4 openMac, echo=TRUE, eval=TRUE, cache=FALSE}
# set path and filename
# modify the following path to reflect local files
ncpath <- "/Users/bartlein/Dropbox/DataVis/working/data/nc_files/"
ncname <- "etopo1_ig_06min.nc"
ncfname <- paste(ncpath, ncname, sep="")
dname <- "elev"
```
```{r open etopo1, cache=FALSE}
# open a netCDF file
ncin <- nc_open(ncfname)
print(ncin)
```
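As an aside, here is a sketch of how such an aggregation could be done with `raster::aggregate()` (not run); the 30-sec input object name is hypothetical:
```{r aggregate sketch, eval=FALSE}
# not run: aggregate (hypothetical) 30-sec data to 6 min; 6 min = 360 sec,
# so aggregate by a factor of 12, averaging the cells in each block
library(raster)
etopo1_06min <- aggregate(etopo1_30sec, fact = 12, fun = mean)
```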
Get the "control" longitudes and latitudes.
```{r etopo1 get lons and lats}
# get longitude and latitude
lon <- ncvar_get(ncin,"lon")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin,"lat")
nlat <- dim(lat)
head(lat)
print(c(nlon,nlat))
```
Now read the elevations, and various attributes:
```{r get elev}
# get elevations
etopo1_array <- ncvar_get(ncin,dname)
dlname <- ncatt_get(ncin,dname,"long_name")
dunits <- ncatt_get(ncin,dname,"units")
fillvalue <- ncatt_get(ncin,dname,"_FillValue")
dsource <- ncatt_get(ncin, "elev", "source")
dim(etopo1_array)
```
```{r etopo1 get global attributes}
# get global attributes
title <- ncatt_get(ncin,0,"title")
institution <- ncatt_get(ncin,0,"institution")
datasource <- ncatt_get(ncin,0,"source")
history <- ncatt_get(ncin,0,"history")
Conventions <- ncatt_get(ncin,0,"Conventions")
```
Close the netCDF file using the `nc_close()` function.
```{r ncdf4 close etopo1}
# close the netCDF file
nc_close(ncin)
```
Produce a quick map to check that the data make sense. (*Always do this!*)
```{r levelplot elev, fig.width=5, fig.height=3, cache=TRUE}
# levelplot of elevations
grid <- expand.grid(lon=lon, lat=lat)
cutpts <- c(-7000, -6000, -4000, -2000, 0, 500, 1000, 1500, 2000, 3000, 4000, 5000)
levelplot(etopo1_array ~ lon * lat, data=grid, at=cutpts, cuts=11, pretty=TRUE,
          col.regions=topo.colors(12))
```
Open and read the .csv file containing the "target" points.
```{r read na10km_v2 target points}
# read na10km_v2 grid-point locations -- land-points only
# modify the following path to reflect local files
csv_path <- "/Users/bartlein/Dropbox/DataVis/working/data/csv_files/"
csv_name <- "na10km_v2_pts.csv"
csv_file <- paste(csv_path, csv_name, sep="")
na10km_v2 <- read.csv(csv_file)
str(na10km_v2)
```
Get the number of target points:
```{r get ntarg}
# get number of target points
ntarg <- dim(na10km_v2)[1]
ntarg
```
## Interpolation ##
Set up to do the interpolation. Generate x's and y's for the target grid. In practice, these might be read in from a netCDF file. Here we need them to define an array that will hold the interpolated values.
```{r interp setup}
# set dimensions of output array, and define "marginal" x- and y-values
nx <- 1078; ny <- 900
x <- seq(-5770000, 5000000, by=10000)
y <- seq(-4510000, 4480000, by=10000)
# define a vector and array that will contain the interpolated values
interp_var <- rep(NA, ntarg)
interp_mat <- array(NA, dim=c(nx, ny))
```
Get the array subscripts for each point. This is similar to the work done in reshaping a "short" data frame to an array, as in Week 4.
```{r targ subs}
# get array subscripts for each target point
j <- sapply(na10km_v2$x, function(c) which.min(abs(x-c)))
k <- sapply(na10km_v2$y, function(c) which.min(abs(y-c)))
head(cbind(j,k,na10km_v2$lon,na10km_v2$lat))
tail(cbind(j,k,na10km_v2$lon,na10km_v2$lat))
```
Now do bilinear interpolation using the `fields` package:
```{r bilinear}
# bilinear interpolation from fields package
control_dat <- list(x=lon, y=lat, z=etopo1_array)
interp_var <- interp.surface(control_dat, cbind(na10km_v2$lon,na10km_v2$lat))
# put interpolated values into a 2-d array
interp_mat[cbind(j,k)] <- interp_var[1:ntarg]
```
Get a quick map:
```{r interp map}
grid <- expand.grid(x=x, y=y)
cutpts <- c(-7000, -6000, -4000, -2000, 0, 500, 1000, 1500, 2000, 3000, 4000, 5000)
levelplot(interp_mat ~ x * y, data=grid, at=cutpts, cuts=11, pretty=TRUE,
          col.regions=topo.colors(12))
```
At this point, `interp_mat` could be written out as a variable in a netCDF file (along with dimension and attribute data); a sketch of that appears after the next chunk. It is also possible to make a data frame of the interpolated data, which could be written out as a .csv file:
```{r make df}
# make a dataframe of the interpolated elevations
out_df <- data.frame(cbind(na10km_v2$x, na10km_v2$y, na10km_v2$lon, na10km_v2$lat, interp_var))
names(out_df) <- c("x", "y", "lon", "lat", "elev")
head(out_df); tail(out_df)
```
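Here is a minimal sketch of the netCDF write mentioned above (not run), following the pattern of the fire-data example; the output filename is an assumption:
```{r write elev nc, eval=FALSE}
# not run: write interp_mat to a netCDF file; the filename is hypothetical
# (ncdf4 was loaded above)
xdim <- ncdim_def("x", "m", as.double(x))
ydim <- ncdim_def("y", "m", as.double(y))
fillvalue <- 1e32
elev_def <- ncvar_def("elev", "m", list(xdim, ydim), fillvalue, "elevation", prec="single")
elev_out <- interp_mat
elev_out[is.na(elev_out)] <- fillvalue # replace NAs with the fill value
ncout <- nc_create(paste(ncpath, "na10km_v2_elev.nc", sep=""), elev_def, force_v4=TRUE)
ncvar_put(ncout, elev_def, elev_out)
nc_close(ncout)
```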
A quick `ggplot2` map can be made:
```{r gg1}
# ggplot of the interpolated data
library(ggplot2)
ggplot(data = out_df, aes(x = x, y = y)) +
  geom_tile(aes(fill = elev)) +
  scale_fill_gradientn(colors = terrain.colors(12)) +
  theme_bw()
```
There are two things to note about this map: first, the yellow/green color scheme generated by the `terrain.colors()` function will not be interpretable by many color-deficient viewers; and second, although plotting the data relative to the projected x- and y-values gives a map that "looks right", it is not obvious how to put a longitude-by-latitude graticule on it.
The solution is to turn the data frame into an `sf` object, using the projected x- and y-values as coordinates, and adding the appropriate coordinate reference system information.
```{r gg2}
# make an sf object
NA_10km_v2_elev_sf <- st_as_sf(out_df, coords = c("x", "y"))
NA_10km_v2_elev_sf
```
```{r gg3}
# add (projected) CRS
st_crs(NA_10km_v2_elev_sf) <-
  st_crs("+proj=laea +lon_0=-100 +lat_0=50 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
NA_10km_v2_elev_sf
```
Projected `ggplot2` map with a grayscale color scheme:
```{r gg4}
# ggplot2 map
pal <- rev(brewer.pal(9, "Greys"))
ggplot(data = NA_10km_v2_elev_sf) +
  geom_sf(aes(color = elev), size = 0.0001) +
  scale_color_gradientn(colors = pal) +
  labs(x = "Longitude", y = "Latitude") +
  theme_bw()
```
[[Back to top]](lec07.html)
# Readings #
In addition to the web pages listed in the *Maps in R* topic, geospatial analysis is discussed in the following Bookdown eBooks:
Lovelace, R., J. Nowosad, and J. Muenchow, 2019, *Geocomputation with R* [[https://bookdown.org/robinlovelace/geocompr/]](https://bookdown.org/robinlovelace/geocompr/)
Pebesma, E. and R. Bivand, 2020, *Spatial Data Science* [[https://keen-swartz-3146c4.netlify.com]](https://keen-swartz-3146c4.netlify.com)
Also worth looking at is Chapter 3 in Bivand, R., et al. (2013) *Applied Spatial Data Analysis with R*, 2nd edition, which describes the `sp`-package approach to mapping. (Search for the eBook version on the UO Library page [[http://library.uoregon.edu]](http://library.uoregon.edu).) The whole book is worth looking at, but Ch. 3 is the key reference for the display of spatial data (i.e., mapping) using the older `sp` package.
[[Back to top]](lec07.html)