---
title: "Geospatial analysis in R"
output:
  html_document:
    fig_caption: no
    number_sections: yes
    toc: yes
    toc_float: false
    collapsed: no
---
```{r set-options, echo=FALSE}
options(width = 105)
knitr::opts_chunk$set(dev='png', dpi=300, cache=FALSE)
pdf.options(useDingbats = TRUE)
klippy::klippy(position = c('top', 'right'))
```
<p><span style="color: #00cc00;">NOTE: This page has been revised for Winter 2021, but may undergo further edits.</span></p>
# Introduction #
The `sf` package (and `sp` before it), with its direct links to the `GDAL`, `GEOS`, and `PROJ` libraries, gives **R** a powerful geospatial-analysis capability, and at this point in the development of packages and methods for handling spatial data, practically anything that can be accomplished in commercial or open-source GIS software can be done in **R**. Because **R** is script-oriented, this has the added benefit of reproducibility, inasmuch as scripts can be saved and reused, while analyses done in menu-focused GUIs cannot be.
The examples here cover some of the typical things one might want to do in a GIS package, with an emphasis on tasks arising in physical geography.
# Extract data from a raster #
This example demonstrates how to extract data from a raster for a set of target points contained in a .csv file. In this case, the raster is the Ramankutty and Foley potential natural vegetation data set ([https://www.nelson.wisc.edu/sage/data-and-models/global-land-use/index.php](https://www.nelson.wisc.edu/sage/data-and-models/global-land-use/index.php)), and the target points are the Global Charcoal Database (GCDv3) site locations ([http://www.gpwg.paleofire.org](http://www.gpwg.paleofire.org)), with the task being to "look up" the vegetation at each site.
Load some packages:
```{r begin extract, results="hide", message=FALSE}
# load packages
library(raster)
library(rasterVis)
library(RColorBrewer)
library(sf)
```
## Read the data sets -- source and target ##
Use the `raster` package to read the vegetation data, which are stored in a netCDF file, a self-documenting, machine-independent format in which a lot of Earth-system science data (like climate-model output or satellite remote-sensing data) are stored. Note that the `ncdf4` package is automatically loaded if the source file is a netCDF file.
```{r read veg}
# read potential natural vegetation data sage_veg30.nc:
# modify the following path to reflect local files
vegtype_path <- "/Users/bartlein/Documents/geog495/data/nc_files/"
vegtype_name <- "sage_veg30.nc"
vegtype_file <- paste(vegtype_path, vegtype_name, sep="")
vegtype <- raster(vegtype_file, varname="vegtype")
vegtype
```
Plot the vegetation data using a `rasterVis` levelplot:
```{r plot veg}
mapTheme <- rasterTheme(region=rev(brewer.pal(8,"Greens")))
levelplot(vegtype, margin=FALSE, par.settings=mapTheme,
          main="Potential Natural Vegetation")
```
Read the charcoal data locations:
```{r read charcoal}
# read GCDv3 sites
# modify the following path to reflect local files
csv_path <- "/Users/bartlein/Documents/geog495/data/csv/"
csv_name <- "v3i_nsa_globe.csv"
csv_file <- paste(csv_path, csv_name, sep="")
gcdv3 <- read.csv(csv_file)
str(gcdv3)
plot(gcdv3$Lon, gcdv3$Lat, pch=16, cex=0.5, col="blue")
```
In order to use the `extract()` function from `raster`, the target points must be turned into an `sf` object.
```{r sf1}
# turn into an sf object
gcdv3_sf <- st_as_sf(gcdv3, coords = c("Lon", "Lat"))
class(gcdv3_sf)
gcdv3_sf
```
Add the CRS:
```{r sf2}
st_crs(gcdv3_sf) <- st_crs("+proj=longlat")
```
Plot the target points on top of the source map. The `as_Spatial()` function converts an `sf` object to an `sp` spatial-data type on the fly.
```{r plot both}
plt <- levelplot(vegtype, margin=FALSE, par.settings=mapTheme,
                 main="Potential Natural Vegetation")
plt + layer(sp.points(as_Spatial(gcdv3_sf), col="blue", pch=16, cex=0.5))
```
## Extract data at target points ##
Now extract the data for the target points:
```{r extract}
# extract data from the raster at the target points
gcdv3_vegtype <- extract(vegtype, gcdv3_sf, method="simple")
class(gcdv3_vegtype)
head(gcdv3_vegtype)
```
Make a dataframe of the extracted data that could be saved as a .csv file, and plot it:
```{r make dataframe}
pts <- data.frame(gcdv3$Lon, gcdv3$Lat, gcdv3_vegtype)
names(pts) <- c("Lon", "Lat", "vegtype")
head(pts, 10)
plotclr <- rev(brewer.pal(8,"Greens"))
plotclr <- c("#AAAAAA", plotclr)
cutpts <- c(0, 2, 4, 6, 8, 10, 12, 14, 16)
color_class <- findInterval(gcdv3_vegtype, cutpts)
plot(pts$Lon, pts$Lat, col=plotclr[color_class+1], pch=16)
```
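The `pts` data frame could be saved with `write.csv()`; a minimal sketch (not run), in which the output filename is an assumption:
```{r write pts, eval=FALSE}
# not run: save the extracted values as a .csv file; the output
# filename here is hypothetical
write.csv(pts, paste(csv_path, "gcdv3_vegtype.csv", sep=""), row.names=FALSE)
```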
Plot the extracted data at the target points on top of the source map. If the extraction is successful, the target-point colors should disappear against the background.
```{r plot over}
plt <- levelplot(vegtype, margin=FALSE, par.settings=mapTheme,
                 main="Potential Natural Vegetation")
plotclr <- rev(brewer.pal(8,"Greens"))
cutpts <- c(0, 2, 4, 6, 8, 10, 12, 14, 16)
color_class <- findInterval(gcdv3_vegtype, cutpts)
plt + layer(sp.points(as_Spatial(gcdv3_sf), col=plotclr[color_class], pch=16, cex=0.6)) +
  layer(sp.points(as_Spatial(gcdv3_sf), col="black", pch=1, cex=0.6))
```
Looks ok.
[[Back to top]](lec07.html)
## A second example -- explicit cell selection ##
Here's a second example of extracting values from an array, by referencing the specific cell or array element that a target point falls in. In this example, a netCDF file of bioclimatic variables is read using `ncdf4` and the values of `mtco` (mean temperature of the coldest month) are extracted.
```{r library2, cache=FALSE}
library(ncdf4)
library(lattice)
library(classInt)
library(RColorBrewer)
```
```{r ncdf4 read mtco, echo=TRUE, eval=TRUE, cache=FALSE}
# set path and filename
# modify the following path to reflect local files
ncpath <- "/Users/bartlein/Documents/geog495/data/nc_files/"
ncname <- "cru10min30_bio.nc"
ncfname <- paste(ncpath, ncname, sep="")
```
```{r open mtco, echo=TRUE, eval=TRUE, cache=FALSE}
# open a netCDF file
ncin <- nc_open(ncfname)
print(ncin)
```
```{r ncdf4 mtco}
# get longitude and latitude
lon <- ncvar_get(ncin,"lon")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin,"lat")
nlat <- dim(lat)
head(lat)
print(c(nlon,nlat))
```
```{r ncdf4 get data}
# get the mtco data
mtco <- ncvar_get(ncin,"mtco")
dlname <- ncatt_get(ncin,"mtco","long_name")
dunits <- ncatt_get(ncin,"mtco","units")
fillvalue <- ncatt_get(ncin,"mtco","_FillValue")
dim(mtco)
```
```{r recode missing mtco}
mtco[mtco==fillvalue$value] <- NA
```
Close the netCDF file using the `nc_close()` function.
```{r ncdf4 close}
# close the netCDF file
nc_close(ncin)
```
Plot the control data.
```{r levelplot1, fig.width=5, fig.height=3}
# levelplot of the slice
grid <- expand.grid(lon=lon, lat=lat)
cutpts <- c(-50,-40,-30,-20,-10,0,10,20,30,40,50)
levelplot(mtco ~ lon * lat, data=grid, at=cutpts, cuts=11, pretty=TRUE,
          col.regions=(rev(brewer.pal(10,"RdBu"))))
```
Now extract the data for the target points:
Get the indices (`j`'s and `k`'s) of the grid cell that each target point lies in; that is, for each target point, figure out which column (`j`) and row (`k`) of the control-data array it falls in. This code is basically the same as that used in reshaping a "short" data frame into an array: the function defined and executed within `sapply()` finds, for each target point, the grid cell whose center is closest, and the `j`'s and `k`'s together identify the control-data grid cells that the individual target points fall in.
```{r get jk}
j <- sapply(gcdv3$Lon, function(x) which.min(abs(lon-x)))
k <- sapply(gcdv3$Lat, function(x) which.min(abs(lat-x)))
head(cbind(j,k)); tail(cbind(j,k))
```
Get the data for each `j`, `k` combination. The way to do this is to convert the `mtco` array to a vector, and then calculate an index `jk` for each target value:
```{r get data}
mtco_vec <- as.vector(mtco)
jk <- (k-1)*nlon + j
gcdv3_mtco <- mtco_vec[jk]
head(cbind(j,k,jk,gcdv3_mtco,lon[j],lat[k]))
```
```{r make dataframe mtco}
gcdv3_mtco[is.na(gcdv3_mtco)] <- -99
pts <- data.frame(gcdv3$Lon, gcdv3$Lat, gcdv3_mtco)
names(pts) <- c("Lon", "Lat", "mtco")
head(pts, 20)
```
Plot the extracted values of `mtco`. To do this, the colors for plotting the different levels of `mtco` are generated from an `RColorBrewer` palette, and augmented by gray (`"#AAAAAA"`) to handle missing values (i.e., from marine charcoal records, or from sites "off" the cru10min30 grid).
```{r plot mtco}
plotclr <- rev(brewer.pal(10,"RdBu"))
plotclr <- c("#AAAAAA", plotclr)
cutpts <- c(-50,-40,-30,-20,-10,0,10,20,30,40,50)
color_class <- findInterval(gcdv3_mtco, cutpts)
plot(gcdv3$Lon, gcdv3$Lat, col=plotclr[color_class+1], pch=16)
```
[[Back to top]](lec07.html)
# Clipping/Trimming/Point-in-polygon analyses #
A common problem that arises in dealing with spatial data is the "clipping" or "trimming" problem, in which one wants to know whether one or more points lie within the outline of a particular polygon, or, in the case of multiple polygons, which polygon an individual point lies in. More formally, this is known as the "point-in-polygon" problem. Here the process of clipping the na10km_v2 climate grid to a tree-species shapefile, in particular that for *Picea mariana* (black spruce), is demonstrated. What we want is a list of climate-data-set points that lie within the range of *P. mariana*.
## Read the polygon and target-point files ##
Read the shapefile for *Picea mariana*:
```{r read picae}
# read the shapefile for Picea mariana
# modify the following path to reflect local files
shp_file <- "/Users/bartlein/Documents/geog495/data/shp/picemari.shp"
picea_sf <- st_read(shp_file)
picea_sf
```
Plot the outline.
```{r plot picea}
plot(st_geometry(picea_sf))
```
Read the na10km_v2 points from a .csv file (for illustration; in practice it would be more efficient to read them from a netCDF file, as sketched below). This file contains the grid-point locations of a present-day climate data set, and we want a list of the points that fall within the range limits of *Picea mariana*.
```{r read na10km_v2}
# read the na10km_v2 points (as a .csv file)
csv_file <- "/Users/bartlein/Documents/geog495/data/csv/na10km_v2.csv"
na10km_v2 <- read.csv(csv_file)
str(na10km_v2)
```
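As noted above, the same point locations could be read from a netCDF version of the file. A minimal sketch (not run), in which the file and variable names are assumptions for illustration:
```{r read na10km nc, eval=FALSE}
# not run: the file and variable names here are hypothetical
library(ncdf4)
ncin <- nc_open("/Users/bartlein/Documents/geog495/data/nc_files/na10km_v2.nc")
lon <- ncvar_get(ncin, "lon") # 2-d arrays of grid-point longitudes...
lat <- ncvar_get(ncin, "lat") # ...and latitudes
nc_close(ncin)
```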
We don't need data from east of 45 W, so trim the data set.
```{r }
na10km_v2 <- na10km_v2[na10km_v2$lon <= -45.0, ]
str(na10km_v2)
```
Make an `sf` object of the na10km_v2 data-point locations, and plot it:
```{r na10km dataframe}
# make an sf object
na10km_v2_sf <- st_as_sf(na10km_v2, coords = c("lon", "lat"))
st_crs(na10km_v2_sf) <- st_crs("+proj=longlat")
na10km_v2_sf
plot(st_geometry(na10km_v2_sf), pch=16, cex=0.2) # takes a little while
```
## Overlay the points onto the polygons ##
Now overlay the points onto the polygons. The `st_join()` function can take a little while to run.
```{r over}
# overlay the two
st_crs(picea_sf) <- st_crs(na10km_v2_sf) # make sure CRS's exactly match
picea_pts_sf <- st_join(na10km_v2_sf, picea_sf)
picea_pts_sf <- na.omit(picea_pts_sf)
picea_pts_sf
```
The warning `although coordinates are longitude/latitude, st_intersects assumes that they are planar` arises because longitude and latitude are in fact anisotropic, and it would be better to do this with projected data with isotropic coordinates (i.e., x- and y-coordinates in metres).
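A minimal sketch of that alternative (not run): transform both layers to a projected coordinate system before the join. The CRS here (EPSG:5070, an Albers equal-area projection for the conterminous U.S.) is an assumption, chosen only for illustration:
```{r project join sketch, eval=FALSE}
# not run: project both layers before joining; EPSG:5070 is an
# illustrative (hypothetical) choice of equal-area CRS
picea_proj <- st_transform(picea_sf, crs = 5070)
na10km_proj <- st_transform(na10km_v2_sf, crs = 5070)
picea_pts_proj <- st_join(na10km_proj, picea_proj)
```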
```{r}
plot(st_geometry(na10km_v2_sf), pch=16, cex=0.2, axes=TRUE)
plot(st_geometry(picea_pts_sf), pch=16, cex=0.2, col="green", add=TRUE)
```
[[Back to top]](lec07.html)
# Gridding or rasterizing point data #
Another common task is to grid or rasterize a set of point data, creating a gridded data set of counts, averages, minima, maxima, etc. This can be illustrated using the FPA-FOD daily fire-start data set: Spatial wildfire occurrence data for the United States, 1992-2013/Fire Program Analysis Fire-Occurrence Database [FPA_FOD_20150323] (3rd Edition) (Short, K.C., 2014, Earth Syst. Sci. Data, 6, 1-27) – [http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/](http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/). The idea here is to calculate the number and average area of the fires that occurred in the 0.5-degree grid cells that cover the conterminous U.S.
The analysis here uses an approach from the `sp` package.
## Read the data sets, and create empty rasters ##
```{r rasterize begin}
library(raster)
library(rasterVis)
```
Read the fire-start data:
```{r read fire}
# read the FPA-FOD fire-start data
# modify the following path to reflect local files
csv_path <- "/Users/bartlein/Dropbox/DataVis/working/data/csv_files/"
csv_name <- "fpafod_1992-2013.csv"
csv_file <- paste(csv_path, csv_name, sep="")
fpafod <- read.csv(csv_file) # takes a while
str(fpafod)
```
Convert the fire-start data to a `SpatialPointsDataFrame`
```{r fire dataframe}
fpafod_coords <- cbind(fpafod$longitude, fpafod$latitude)
fpafod_pts <- SpatialPointsDataFrame(coords=fpafod_coords, data=data.frame(fpafod$area_ha))
names(fpafod_pts) <- "area_ha"
```
Create a couple of empty rasters to hold the rasterized data:
```{r create rasters}
# create (empty) rasters
cell_size <- 0.5
lon_min <- -128.0; lon_max <- -65.0; lat_min <- 25.5; lat_max <- 50.5
ncols <- ((lon_max - lon_min)/cell_size)+1; nrows <- ((lat_max - lat_min)/cell_size)+1
us_fire_counts <- raster(nrows=nrows, ncols=ncols, xmn=lon_min, xmx=lon_max, ymn=lat_min, ymx=lat_max, res=0.5, crs="+proj=longlat +datum=WGS84")
us_fire_counts
us_fire_area <- raster(nrows=nrows, ncols=ncols, xmn=lon_min, xmx=lon_max, ymn=lat_min, ymx=lat_max, res=0.5, crs="+proj=longlat +datum=WGS84")
us_fire_area
```
## Rasterize the data ##
Now rasterize the data, first as counts (the number of fires in each grid cell), and then as the average size of the fires in each grid cell. Also plot the data (both on a log10 scale):
```{r rasterize}
# rasterize
us_fire_counts <- rasterize(fpafod_coords, us_fire_counts, fun="count")
us_fire_counts
plot(log10(us_fire_counts), col=brewer.pal(9,"BuPu"), sub="log10 Number of Fires")
us_fire_area <- rasterize(fpafod_pts, us_fire_area, fun=mean)
us_fire_area
plot(log10(us_fire_area$area_ha), col=brewer.pal(9,"YlOrRd"), sub="log10 Mean Area")
```
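As an aside, the rasterized layers could also be written directly to a raster-file format with `writeRaster()`; a minimal sketch (not run), in which the output filename and format are assumptions:
```{r write raster sketch, eval=FALSE}
# not run: write the counts raster as a (hypothetical) GeoTIFF file
writeRaster(us_fire_counts, "us_fire_counts.tif", format="GTiff", overwrite=TRUE)
```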
## Write the rasterized data out as a netCDF file ##
Write the two rasterized data sets out as variables in a netCDF data set. Create some variables, and replace the R `NA`s with netCDF fill values:
```{r netCDF setup}
# make necessary vectors and arrays
lon <- seq(lon_min+0.25, lon_max-0.25, by=cell_size)
lat <- seq(lat_max-0.25, lat_min+0.25, by=-1*cell_size)
print(c(length(lon), length(lat)))
fillvalue <- 1e32
us_fire_counts2 <- t(as.matrix(us_fire_counts$layer, nrows=ncols, ncols=nrows))
dim(us_fire_counts2)
us_fire_counts2[is.na(us_fire_counts2)] <- fillvalue
us_fire_area2 <- t(as.matrix(us_fire_area$area_ha, nrows=ncols, ncols=nrows))
dim(us_fire_area2)
us_fire_area2[is.na(us_fire_area2)] <- fillvalue
```
Write out a netCDF file:
```{r write netCDF}
# write out a netCDF file
library(ncdf4)
# path and file name, set dname
# modify the following path to reflect local files
ncpath <- "/Users/bartlein/Dropbox/DataVis/working/data/nc_files/"
ncname <- "us_fires.nc"
ncfname <- paste(ncpath, ncname, sep="")
# create and write the netCDF file -- ncdf4 version
# define dimensions
londim <- ncdim_def("lon","degrees_east",as.double(lon))
latdim <- ncdim_def("lat","degrees_north",as.double(lat))
# define variables
dname <- "fpafod_counts"
dlname <- "Number of fires, 1992-2013"
v1_def <- ncvar_def(dname,"1",list(londim,latdim),fillvalue,dlname,prec="single")
dname <- "fpafod_mean_area"
dlname <- "Average Fire Size, 1992-2013"
v2_def <- ncvar_def(dname,"ha",list(londim,latdim),fillvalue,dlname,prec="single")
# create netCDF file and put arrays
ncout <- nc_create(ncfname,list(v1_def, v2_def),force_v4=TRUE)
# put variables
ncvar_put(ncout,v1_def,us_fire_counts2)
ncvar_put(ncout,v2_def,us_fire_area2)
# put additional attributes into dimension and data variables
ncatt_put(ncout,"lon","axis","X")
ncatt_put(ncout,"lat","axis","Y")
# add global attributes
ncatt_put(ncout,0,"title","FPA-FOD Fires")
ncatt_put(ncout,0,"institution","USFS")
ncatt_put(ncout,0,"source","http://www.fs.usda.gov/rds/archive/Product/RDS-2013-0009.3/")
ncatt_put(ncout,0,"references", "Short, K.C., 2014, Earth Syst. Sci. Data, 6, 1-27")
history <- paste("P.J. Bartlein", date(), sep=", ")
ncatt_put(ncout,0,"history",history)
ncatt_put(ncout,0,"Conventions","CF-1.6")
# Get a summary of the created file:
ncout
# close the file, writing data to disk
nc_close(ncout)
```
[[Back to top]](lec07.html)
# Interpolating/regridding #
Another common task involves transferring values from one gridded data set to another. When the grids are identical, this is trivial, but when the "target" grid differs from the source or "control" grid, interpolation is required (as it also is when the target points are irregularly distributed). The most widely used method for interpolation within a control grid is *bilinear interpolation*, which involves finding the control grid points that surround a particular target point, and then simultaneously (linearly) interpolating in the x- and y-directions. The method is implemented in the `raster` package, and a relatively fast version is implemented in the `fields` package.
The example here uses a lower-resolution version of the `ETOPO1` global DEM as the source file, and the locations of the (land-only) points in the na10km_v2 grid as the targets. These are read in here from a .csv file, but they could also have come from a netCDF file.
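To make the arithmetic concrete, here is a minimal sketch of bilinear interpolation at a single target point (not run), assuming a regular grid with ascending coordinate values and a target point that lies inside the grid; `fields::interp.surface()`, used below, does the equivalent for many points at once, with proper edge handling:
```{r bilinear sketch, eval=FALSE}
# not run: bilinear interpolation of the gridded values z at one target
# point (x0, y0); assumes lon and lat are ascending and (x0, y0) lies
# within the grid
bilin <- function(lon, lat, z, x0, y0) {
  j <- findInterval(x0, lon); k <- findInterval(y0, lat) # enclosing cell
  tx <- (x0 - lon[j]) / (lon[j + 1] - lon[j]) # fractional x-distance
  ty <- (y0 - lat[k]) / (lat[k + 1] - lat[k]) # fractional y-distance
  (1 - tx) * (1 - ty) * z[j, k] + tx * (1 - ty) * z[j + 1, k] +
    (1 - tx) * ty * z[j, k + 1] + tx * ty * z[j + 1, k + 1]
}
```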
## Open the "control" netCDF and target .csv files ##
Load the appropriate packages.
```{r ncdf4 package, message=FALSE, warning=FALSE, results='hide'}
# load the ncdf4 package
library(ncdf4)
library(lattice)
library(fields)
```
Read the etopo1 netCDF file. This particular file is one in which the original 30-sec data have been aggregated (by averaging) to six minutes, or one-tenth of a degree, to speed up the execution of the examples; a sketch of that aggregation step appears below. In practice, one would work with the original higher-resolution data. Do some setup, open the file, and list its contents.
```{r ncdf4 openMac, echo=TRUE, eval=TRUE, cache=FALSE}
# set path and filename
# modify the following path to reflect local files
ncpath <- "/Users/bartlein/Dropbox/DataVis/working/data/nc_files/"
ncname <- "etopo1_ig_06min.nc"
ncfname <- paste(ncpath, ncname, sep="")
dname <- "elev"
```
```{r open etopo1, cache=FALSE}
# open a netCDF file
ncin <- nc_open(ncfname)
print(ncin)
```
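As an aside, here is a sketch of how such an aggregation could be done with `raster::aggregate()` (not run); the 30-sec input object name is hypothetical:
```{r aggregate sketch, eval=FALSE}
# not run: aggregate (hypothetical) 30-sec data to 6 min; 6 min = 360 sec,
# so aggregate by a factor of 12, averaging the cells in each block
library(raster)
etopo1_06min <- aggregate(etopo1_30sec, fact = 12, fun = mean)
```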
Get the "control" longitudes and latitudes.
```{r etopo1 get lons and lats}
# get longitude and latitude
lon <- ncvar_get(ncin,"lon")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin,"lat")
nlat <- dim(lat)
head(lat)
print(c(nlon,nlat))
```
Now read the elevations, and various attributes:
```{r get elev}
# get elevations
etopo1_array <- ncvar_get(ncin,dname)
dlname <- ncatt_get(ncin,dname,"long_name")
dunits <- ncatt_get(ncin,dname,"units")
fillvalue <- ncatt_get(ncin,dname,"_FillValue")
dsource <- ncatt_get(ncin, "elev", "source")
dim(etopo1_array)
```
```{r etopo1 get global attributes}
# get global attributes
title <- ncatt_get(ncin,0,"title")
institution <- ncatt_get(ncin,0,"institution")
datasource <- ncatt_get(ncin,0,"source")
history <- ncatt_get(ncin,0,"history")
Conventions <- ncatt_get(ncin,0,"Conventions")
```
Close the netCDF file using the `nc_close()` function.
```{r ncdf4 close etopo1}
# close the netCDF file
nc_close(ncin)
```
Produce a quick map to check that the data make sense. (*Always do this!*)
```{r levelplot elev, fig.width=5, fig.height=3, cache=TRUE}
# levelplot of elevations
grid <- expand.grid(lon=lon, lat=lat)
cutpts <- c(-7000, -6000, -4000, -2000, 0, 500, 1000, 1500, 2000, 3000, 4000, 5000)
levelplot(etopo1_array ~ lon * lat, data=grid, at=cutpts, cuts=11, pretty=TRUE,
          col.regions=topo.colors(12))
```
Open and read the .csv file containing the "target" points.
```{r read na10km_v2 target points}
# read na10km_v2 grid-point locations -- land-points only
# modify the following path to reflect local files
csv_path <- "/Users/bartlein/Dropbox/DataVis/working/data/csv_files/"
csv_name <- "na10km_v2_pts.csv"
csv_file <- paste(csv_path, csv_name, sep="")
na10km_v2 <- read.csv(csv_file)
str(na10km_v2)
```
Get the number of target points:
```{r get ntarg}
# get number of target points
ntarg <- dim(na10km_v2)[1]
ntarg
```
## Interpolation ##
Set up to do the interpolation. Generate x's and y's for the target grid. In practice, these might be read in from a netCDF file. Here we need them to define an array that will hold the interpolated values.
```{r interp setup}
# set dimensions of output array, and define "marginal" x- and y-values
nx <- 1078; ny <- 900
x <- seq(-5770000, 5000000, by=10000)
y <- seq(-4510000, 4480000, by=10000)
# define a vector and array that will contain the interpolated values
interp_var <- rep(NA, ntarg)
interp_mat <- array(NA, dim=c(nx, ny))
```
Get the array subscripts for each point. This is similar to the work done in reshaping a "short" data frame to an array, as in Week 4.
```{r targ subs}
# get array subscripts for each target point
j <- sapply(na10km_v2$x, function(c) which.min(abs(x-c)))
k <- sapply(na10km_v2$y, function(c) which.min(abs(y-c)))
head(cbind(j,k,na10km_v2$lon,na10km_v2$lat))
tail(cbind(j,k,na10km_v2$lon,na10km_v2$lat))
```
Now do bilinear interpolation using the `fields` package:
```{r bilinear}
# bilinear interpolation from fields package
control_dat <- list(x=lon, y=lat, z=etopo1_array)
interp_var <- interp.surface(control_dat, cbind(na10km_v2$lon,na10km_v2$lat))
# put interpolated values into a 2-d array
interp_mat[cbind(j,k)] <- interp_var[1:ntarg]
```
Get a quick map:
```{r interp map}
grid <- expand.grid(x=x, y=y)
cutpts <- c(-7000, -6000, -4000, -2000, 0, 500, 1000, 1500, 2000, 3000, 4000, 5000)
levelplot(interp_mat ~ x * y, data=grid, at=cutpts, cuts=11, pretty=TRUE,
          col.regions=topo.colors(12))
```
At this point, `interp_mat` could be written out as a variable in a netCDF file (along with dimension and attribute data); a sketch of that appears after the next chunk. It is also possible to make a data frame of the interpolated data, which could be written out as a .csv file:
```{r make df}
# make a dataframe of the interpolated elevations
out_df <- data.frame(cbind(na10km_v2$x, na10km_v2$y, na10km_v2$lon, na10km_v2$lat, interp_var))
names(out_df) <- c("x", "y", "lon", "lat", "elev")
head(out_df); tail(out_df)
```
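Here is a minimal sketch of the netCDF write mentioned above (not run), following the pattern of the fire-data example; the output filename is an assumption:
```{r write elev nc, eval=FALSE}
# not run: write interp_mat to a netCDF file; the filename is hypothetical
# (ncdf4 was loaded above)
xdim <- ncdim_def("x", "m", as.double(x))
ydim <- ncdim_def("y", "m", as.double(y))
fillvalue <- 1e32
elev_def <- ncvar_def("elev", "m", list(xdim, ydim), fillvalue, "elevation", prec="single")
elev_out <- interp_mat
elev_out[is.na(elev_out)] <- fillvalue # replace NAs with the fill value
ncout <- nc_create(paste(ncpath, "na10km_v2_elev.nc", sep=""), elev_def, force_v4=TRUE)
ncvar_put(ncout, elev_def, elev_out)
nc_close(ncout)
```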
A quick `ggplot2` map can be made:
```{r gg1}
# ggplot of the interpolated data
library(ggplot2)
ggplot(data = out_df, aes(x = x, y = y)) +
  geom_tile(aes(fill = elev)) +
  scale_fill_gradientn(colors = terrain.colors(12)) +
  theme_bw()
```
There are two things to note about this map: first, the yellow/green color scheme generated by the `terrain.colors()` function will not be interpretable by many color-deficient viewers; and second, although plotting the data relative to the projected x- and y-values gives a map that "looks right", it is not obvious how to put a longitude-by-latitude graticule on it.
The solution is to turn the data frame into an `sf` object, using the projected x- and y-values as coordinates, and adding the appropriate coordinate reference system information.
```{r gg2}
# make an sf object
NA_10km_v2_elev_sf <- st_as_sf(out_df, coords = c("x", "y"))
NA_10km_v2_elev_sf
```
```{r gg3}
# add (projected) CRS
st_crs(NA_10km_v2_elev_sf) <-
  st_crs("+proj=laea +lon_0=-100 +lat_0=50 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs")
NA_10km_v2_elev_sf
```
Projected `ggplot2` map with a grayscale color scheme:
```{r gg4}
# ggplot2 map
pal <- rev(brewer.pal(9, "Greys"))
ggplot(data = NA_10km_v2_elev_sf) +
  geom_sf(aes(color = elev), size = 0.0001) +
  scale_color_gradientn(colors = pal) +
  labs(x = "Longitude", y = "Latitude") +
  theme_bw()
```
[[Back to top]](lec07.html)
# Readings #
In addition to the web pages listed in the *Maps in R* topic, geospatial analysis is discussed in the following Bookdown eBooks:
Lovelace, R., J. Nowosad, and J. Muenchow, 2019, *Geocomputation with R* [[https://bookdown.org/robinlovelace/geocompr/]](https://bookdown.org/robinlovelace/geocompr/)
Pebesma, E. and R. Bivand, 2020, *Spatial Data Science* [[https://keen-swartz-3146c4.netlify.com]](https://keen-swartz-3146c4.netlify.com)
Also worth looking at is Chapter 3 in Bivand, R., et al. (2013) *Applied Spatial Data Analysis with R*, 2nd edition, which describes the `sp`-package approach to mapping. (Search for the eBook version on the UO Library page [[http://library.uoregon.edu]](http://library.uoregon.edu).) The whole book is worth looking at, but Ch. 3 is the key reference for the display of spatial data (i.e., mapping) using the older `sp` package.
[[Back to top]](lec07.html)