# Foundations {#sec-foundations}
```{r source_r, echo = FALSE}
source("_common.R")
```
## Goals and Outcomes
* Understand fundamental spatial data structures and libraries in **R**.
* Become familiar with coordinate reference systems.
* Learn the basics of geographic I/O (input/output).
Throughout this section we'll use the following packages:
```{r}
#| warning: false
library(dataRetrieval)
library(tmap)
library(tmaptools)
library(dplyr)
library(tigris)
library(sf)
library(mapview)
mapviewOptions(fgb=FALSE)
library(ggplot2)
library(spData)
library(terra)
library(stars)
library(Rspatialworkshop)
library(readr)
library(cowplot)
library(knitr)
library(rnaturalearth)
library(osmdata)
```
## Background on data structures
:::{.callout-note}
Much of this background on data structures is borrowed from Mike Johnson's [Introduction to Spatial Data Science](https://mikejohnson51.github.io/spds/) and lecture material from our [AWRA 2022 Geo Workshop](https://github.com/mhweber/AWRA2022GeoWorkshop)
:::
Before we dive into spatial libraries, it's worth a quick review of relevant data structures in R - this will be a whirlwind overview, assuming most of you are already familiar with R.
- You, computers, and software 'understand' values in particular and different ways
- Computers convert bytes --> hex --> value
+ Humans read values
+ Software reads Hex bytes
+ Hardware reads Binary bytes
```{r, fig.align='center', echo = FALSE, out.width="75%" }
knitr::include_graphics("img/values.PNG")
```
:::{.callout-note}
What's the difference between 10 and '10'?
- **To us**: meaning
- **To software**: how it is handled
- **To a computer**: nothing
We need human-defined (computer-guessable) data types
:::
Fundamental data types
```{r}
# Numeric
typeof(1.9)
# Integer
typeof(1L)
# Boolean
typeof(TRUE)
# Character
typeof("Welcome")
```
Storing more than one value requires a structure.
Values can be structured in ways such as:
- vectors
- dimensions/attributes
- data.frames
And data.frames can be operated on with functions such as:
- filter
- select
- mutate
- summarize
- group_by
### Vectors
:::{.callout-note}
'vector' has two meanings when working with GIS data and working in R!
- geographic vector data (which we'll explore)
- `vector` class (what we're talking about here)
geographic vector data is a *data model*, and the `vector` class is an R class like `data.frame` and `matrix`
:::
**Vectors** come in two flavors:
1. atomic
2. list
- an atomic vector's elements must all have the same **type**
- a list's elements can have different **types**
### Atomic vectors
```{r}
# Numeric
dbl_vec = c(1.9, 2, 3.5)
typeof(dbl_vec)
length(dbl_vec)
# Logical
lg_vec = c(TRUE, FALSE, F, T)
typeof(lg_vec)
length(lg_vec)
```
**Coercion**
- *type* is a property of a vector
- When you try to combine different *types* they'll be *coerced* to the most flexible type present, in this fixed order (least to most flexible):
  + logical => integer => double => character
- Coercion occurs automatically but generates a warning and a missing value when it fails
```{r}
c("a", 1)
c("a", TRUE)
c(4.5, 1L)
c("1", 18, "GIS")
as.numeric(c("1", 18, "GIS"))
as.logical(c("1", 18, "GIS"))
```
*Subsetting atomic vectors*
```{r}
# Atomic numeric vector
(x = c(3.4, 7, 18, 9.6))
# Third Value
x[3]
# Third and Fourth value
x[c(3,4)]
# Drop the third value
x[-3]
# Keep the 1st and 2nd values, but drop the 3rd and 4th
x[c(T,T,F,F)]
```
### Matrix
- A matrix is a 2D atomic structure (row, column)
+ Same data types
+ Same column length
This is how spatial *raster* data is structured
Subsetting matrices uses row,column (i,j) syntax
```{r}
(x = matrix(1:9, nrow = 3))
x[3,]
x[,3]
x[3,3]
```
### Array
- An array is a 3D atomic structure (row, column, slice)
This is how spatial *raster* data with a time dimension is structured
```{r}
(array(c(1:12), dim = c(3,2,2)))
```
Subsetting arrays uses row, column, slice syntax (i,j,z)
```{r}
(x = array(1:12, dim = c(2,2,3)))
x[1,,]
x[,1,]
x[,,1]
```
### Lists
Each list element can be of any data type
```{r}
(my_list <- list(
matrix(1:4, nrow = 2),
"GIS is great!",
c(TRUE, FALSE, TRUE),
c(2.3, 5.9)
))
```
```{r}
typeof(my_list)
```
Subsetting Lists
- Each element of a list can be accessed with the `[[` operator
```{r}
my_list[[1]]
my_list[[1]][1,2]
```
### Data Frames
- `data.frame` is built on the `list` structure in R
- length of each atomic or list vector has to be the same
- This gives `data.frame` objects rectangular structure so they share properties of both matrices and lists
```{r}
(df1 <- data.frame(name = c("Me", "Tim", "Sarah"),
age = c(53,15,80),
retired = c(F,F,T)))
typeof(df1)
```
Subsetting a `data.frame`
```{r}
df1[1,2]
# or like a list
df1[[2]]
# or with column name operator
df1$name
```
### Data manipulation
- `data.frame` manipulation follows the same logic as SQL queries
- R abstracts the SQL logic and provides function-ized methods
- `dplyr` in the `tidyverse` ecosystem provides the 'grammar of data manipulation' approach we'll use in this workshop
Data manipulation *verbs*:
- Primary:
+ `select()`: keeps or removes variables based on names
+ `filter()`: keeps or removes observations based on values
- Manipulation:
+ `mutate()`: adds new variables that are functions of existing variables
+ `summarise()`: reduces multiple values down to a single summary
+ `arrange()`: changes ordering of the rows
- Grouping:
+ `group_by()`: combine with any or all of the above to perform manipulation 'by group'
### Pipe operator
- The pipe operator (native R pipe operator `|>` or `magrittr` pipe operator `%>%`) provides a more concise and expressive coding experience
- The pipe passes the object on the left hand side of the pipe into the first argument of the right hand function
- To be `|>` compatible, the data.frame is ALWAYS the first argument to `dplyr` verbs
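A minimal base-R illustration of the rewrite the pipe performs (no spatial packages needed; the native `|>` pipe requires R >= 4.1):

```r
# Piped and nested forms of the same operation on the built-in mtcars data
piped  <- mtcars |> subset(mpg > 30) |> nrow()
nested <- nrow(subset(mtcars, mpg > 30))
identical(piped, nested)  # TRUE
```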
A demonstration using `dataRetrieval` package stream gage data from USGS:
```{r}
flows <- dataRetrieval::readNWISdv(
siteNumbers = '14187200',
parameterCd = "00060"
) |>
dataRetrieval::renameNWISColumns()
dplyr::glimpse(flows)
```
::: {.callout-note appearance="simple" icon="false"}
### Exercise
What is the `::` after the package name and before the function in the R code above doing? How would you find out more about it?
:::
::: {.callout-note appearance="simple" icon="false" collapse="true"}
### Solution
`::` specifies the namespace (i.e., package) associated with a particular function.
You can get help on it the same way you get help on anything in R:
```{r}
#| eval: false
?'::'
# or
help("::")
```
:::
### Filter
- `filter()` takes logical (binary) expressions and returns the rows in which all conditions are TRUE.
Filter on a single condition:
```{r}
flows |>
dplyr::filter(Flow > 900) |>
dplyr::glimpse()
```
Or multiple conditions:
```{r}
flows |>
dplyr::filter(Flow > 900, Date > as.Date("2010-01-01")) |>
dplyr::glimpse()
```
### Select
- Subset variables (columns) you want to keep or exclude by name
Keep just two columns
```{r}
flows |>
dplyr::select(Date, Flow) |>
dplyr::glimpse()
```
Exclude just one
```{r}
flows |>
dplyr::select(-Flow_cd) |>
dplyr::glimpse()
```
...And we can rename while selecting
```{r}
flows |>
dplyr::select(Date, flow_cfs = Flow) |>
dplyr::glimpse()
```
### Mutate
- `mutate()` defines and inserts new variables into an existing `data.frame`
- `mutate()` builds new variables sequentially so you can reference earlier ones when defining later ones
We can extract Year and Month as new variables from the Date variable using `format()`:
```{r}
flows |>
dplyr::select(Date, Flow) |>
dplyr::filter(Date > as.Date('2010-01-01')) |>
dplyr::mutate(Year = format(Date, "%Y"), Month = format(Date, "%m")) |>
dplyr::glimpse()
```
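To see that sequential behavior explicitly, here is a small sketch on a toy data frame (not the `flows` data; the column names are made up for illustration) where the second new variable references the first, created in the same `mutate()` call:

```r
library(dplyr)

d <- data.frame(flow_cfs = c(10, 20, 30))

d |>
  mutate(
    flow_cms   = flow_cfs * 0.0283168,       # convert cfs to cubic meters/sec
    above_mean = flow_cms > mean(flow_cms)   # references flow_cms, defined just above
  )
```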
### Summarize and Group_By
- `summarize()` allows you to summarize across all observations
- `group_by()` allows you to apply any of these manipulation verbs by group in your data
```{r}
flows |>
dplyr::select(Date, Flow) |>
dplyr::mutate(Year = format(Date, "%Y")) |>
dplyr::group_by(Year) |>
dplyr::summarize(meanQ = mean(Flow),
maxQ = max(Flow))
```
:::{.callout-note}
You may click on any of the functions in this book to be directed to their respective documentation. For example, clicking on `st_join()` takes you to the documentation page for the `st_join()` function on the `sf` website.
:::
## Why R for Spatial Data Analysis?
**Advantages**
- R is:
+ lightweight
+ open-source
+ cross-platform
- Works with contributed packages - currently `r nrow(utils::available.packages(repos = "http://cran.us.r-project.org"))`
+ provides *extensibility*
- Automation and recording of workflow
+ provides *reproducibility*
- Optimized work flow - data manipulation, analysis and visualization all in one place
+ provides *integration*
- R does not alter underlying data - manipulation and visualization *in memory*
- R is great for repetitive graphics
- R is great for integrating different aspects of analysis - spatial and statistical analysis in one environment
+ again, *integration*
- Leverage statistical power of R (i.e. modeling spatial data, data visualization, statistical exploration)
- Can handle vector and raster data, as well as work with spatial databases and pretty much any data format spatial data comes in
- R's GIS capabilities growing rapidly right now - new packages added monthly - currently about 275 spatial packages (depending on how you categorize)
**Drawbacks** of using R for GIS work
- R is not as good for interactive use as desktop GIS applications like ArcGIS or QGIS (i.e. editing features, panning, zooming, and analysis on selected subsets of features)
- Explicit coordinate system handling by the user
+ no on-the-fly projection support
- In memory analysis does not scale well with large GIS vector and tabular data
- Steep learning curve
- Up to you to find packages to do what you need - help not always great
### A Couple Motivating Examples
Here's a quick example demonstrating R’s flexibility and current strength for plotting and interactive mapping with very simple, expressive code:
```{r}
#| warning: false
# find my town!
my_town <- tmaptools::geocode_OSM("Corvallis OR USA", as.sf = TRUE)
glimpse(my_town)
```
::: {.callout-note appearance="simple" icon="false"}
### Exercise
Run and examine the code chunk above and try other geocoding examples you think of
1. What does `::` do?
2. What does `geocode_OSM()` do?
3. Explain how the code runs together using the `|>` chaining operator
4. What is `glimpse()`? How is it useful compared to `head()`?
:::
::: {.callout-note appearance="simple" icon="false" collapse="true"}
#### Solution
1. `::` specifies using `geocode_OSM` from the `tmaptools` package. R gives namespace preference to packages in order loaded. Some packages share function names, so when there is ambiguity, it's good practice to disambiguate your functions with the double-colon
2. `geocode_OSM()` looks up a named feature in OpenStreetMap and returns the coordinates (bonus - we'll delve into this more in the next section - what coordinate reference system are the coordinates in, and how do you find out?)
3. You would translate code using the ` |> ` operator from:
- do this `|>` do that `|>` do that
To
- do this *then* do that *then* do that
4. Technically, it's a transposed version of `print()` - columns run down page, data across like rows - it additionally gives you the number of observations and variables, and the data type of each column. Note that `glimpse()` omits geometry information - you would type the name of your object at the console to get full information, or use `print.sf()` or `str()`.
- bonus: how would you quickly learn more about `glimpse()` from the console?
:::
#### Choropleth map
The `tigris` package can be used to get census boundaries such as states and counties as vector `sf` objects in R
```{r tmap example}
#| label: fig-tmapexample
#| fig-cap: "Area of Oregon Counties"
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
counties <- counties("Oregon", cb = TRUE)
counties$area <- as.numeric(st_area(counties))
glimpse(counties)
tm_shape(counties) +
tm_polygons("area", style="quantile", title="Area of Oregon Counties")
```
#### Interactive mapping
```{r}
#| label: fig-mapviewexample
#| fig-width: 8
#| results: hide
#| message: false
mapview(my_town, col="red", col.regions = "red") +
mapview(counties, alpha.regions = .1)
```
## Spatial Data Structures in R
First we'll walk through spatial data structures in R and review some key components of data structures in R that inform our use of spatial data in R.
:::{.callout-note}
A few core libraries underpin spatial libraries in R (and Python!) and in GIS software applications such as QGIS and ArcGIS Pro. Spatial data structures across languages and applications are primarily organized through [OSGeo](https://www.osgeo.org) and the [OGC](https://www.ogc.org). These core libraries include:
- [GDAL](https://gdal.org/) --\> For raster and feature abstraction and processing
- [PROJ](https://proj.org/) --\> A library for coordinate transformations and projections
- [GEOS](https://libgeos.org/) --\> a planar geometry engine for operations (measures, relations) such as calculating buffers and centroids on data with a projected CRS
- [S2](https://s2geometry.io/) --\> a spherical geometry engine written in C++ developed by Google and adapted in R with the [**s2**](https://r-spatial.github.io/s2/) package
:::
We'll see how these core libraries are called and used in the R packages we explore throughout this workshop.
:::{.callout-note}
- **Vector** data are comprised of points, lines, and polygons that represent discrete spatial entities, such as a river, watershed, or stream gauge.
- **Raster** data divides spaces into rectilinear cells (pixels) to represent spatially continuous phenomena, such as elevation or the weather. The cell size (or resolution) defines the fidelity of the data.
:::
```{r, fig.align='center', echo = FALSE, out.width="100%"}
knitr::include_graphics("img/09-vec-raster.jpg")
```
### Vector Data Model
For *Vector* data, Simple Features (officially Simple Feature Access) is both an OGC and International Organization for Standardization (ISO) standard that specifies how (mostly) two-dimensional geometries can represent and describe objects in the real world. The Simple Features specification includes:
- a class hierarchy
- a set of operations
- binary and text encodings
It describes how such objects can be stored in and retrieved from databases, and which geometrical operations should be defined for them.
It outlines how the spatial elements of POINTS (XY locations with a specific coordinate reference system) extend to LINES, POLYGONS and GEOMETRYCOLLECTION(s).
The "simple" adjective also refers to the fact that the line or polygon geometries are represented by sequences of points connected with straight lines that do not self-intersect.
```{r}
#| label: fig-sfmodel
#| fig-cap: "The simple features data model."
#| fig-align: center
#| echo: false
#| out-width: "100%"
knitr::include_graphics("img/09-sf-model.png")
```
Here we'll load the `sf` library to load the classes and functions for working with spatial vector data and to explore how `sf` handles vector spatial data in R:
```{r}
#| warning: true
library(sf)
```
::: {.callout-note appearance="simple" icon="false"}
#### Question
What is the message you see in your console the first time you load `sf`?
Hint: If you don't see a message, you probably already have the `sf` library attached - you can uncheck it in your packages pane in RStudio, or you can run `detach("package:sf", unload = TRUE)`
:::
::: {.callout-note appearance="simple" icon="false" collapse="true"}
#### Answer
The output from `library(sf)` reports which versions of key geographic libraries such as GEOS the package is using
:::
You can report on versions of external software used by `sf` with:
```{r}
sf::sf_extSoftVersion()
```
We'll cover highlights of the functionality of `sf`, but we encourage you to read the `sf` package website documentation at [r-spatial.github.io/sf](https://r-spatial.github.io/sf/), which can be viewed offline in RStudio using:
```{r}
#| eval: false
vignette(package = "sf") # see which vignettes are available
vignette("sf1") # open the introduction vignette
```
The `sfg` class in `sf` represents the different simple feature geometry types in R: point, linestring, polygon (and ‘multi’ equivalents such as multipoints) or geometry collection. Vector data in R and `sf` can be broken down into three basic `geometric` primitives with equivalent `sf` functions to create them:
1. points - `st_point()`
2. linestrings - `st_linestring()`
3. polygons - `st_polygon()`
We build up the more complex simple feature geometries with:
- `st_multipoint()` for multipoints
- `st_multilinestring()` for multilinestrings
- `st_multipolygon()` for multipolygons
- `st_geometrycollection()` for geometry collections
Typically we don't build these up from scratch ourselves but it's important to understand these building blocks since primitives are the building blocks for all vector features in `sf`
::: {.callout-note appearance="simple" icon="true" collapse="true"}
- All functions and methods in `sf` that operate on spatial data are prefixed by `st_`, which refers to spatial type.
- This is similar to [PostGIS](http://postgis.net/)
- This makes `sf` functions easily discoverable by command-line completion.
- Simple features are also native R data, using simple data structures (S3 classes, lists, matrix, vector).
:::
These primitives can all be broken down into set(s) of numeric x,y coordinates with a known coordinate reference system (crs).
#### Points
- A point in `sf` is composed of one coordinate pair (XY) in a specific coordinate system.
- A POINT has no length, no area and a dimension of 0.
- **Z** and **M** coordinates: as well as having the necessary X and Y coordinates, single point (vertex) simple features can have:
- a Z coordinate, denoting altitude, and/or
- an M value, denoting some "measure"
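As a quick sketch with `st_point()` (a third coordinate is treated as Z by default, and the `dim` argument lets you say otherwise):

```r
library(sf)

st_point(c(1, 3))                    # XY
st_point(c(1, 3, 250))               # XYZ - third value read as altitude (Z)
st_point(c(1, 3, 250), dim = "XYM")  # XYM - third value read as a measure (M)
st_point(c(1, 3, 250, 7))            # XYZM - both Z and M
```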
```{r}
# create sf object as point
pt1 <- st_point(c(1,3))
pt2 <- st_point(c(2,1))
pt3 <- st_point(c(3,2))
sf_point <- st_sfc(pt1, pt2, pt3)
# POINT defined as numeric vector
(sf::st_dimension(sf_point))
```
```{r}
ggplot(sf_point) +
geom_sf(color = "red") +
labs(x = "X", y = "Y")
```
#### Lines
- A polyline is composed of an ordered sequence of two or more POINTs
- Points in a line are called vertices/nodes and explicitly define the connection between two points.
- A LINESTRING has a length, has no area and has a dimension of 1 (length)
```{r}
# create sf object as linestring
line1 <- sf::st_linestring(rbind(pt1, pt2))
line2 <- sf::st_linestring(rbind(pt2, pt3))
sf_linestring <- st_sfc(line1, line2)
# dimension
(sf::st_dimension(sf_linestring))
```
```{r}
ggplot(sf_linestring) +
geom_sf() +
labs(x = "X", y = "Y")
```
#### Polygon
- A POLYGON is composed of 4 or more points whose starting and ending point are the same.
- A POLYGON is a surface stored as a list of its exterior and interior rings.
- A POLYGON has length and area, and a dimension of 2 (area).
```{r}
pt4 <- pt1
poly1 <- st_polygon(list(rbind(pt1, pt2, pt3, pt4)))
sf_polygon <- st_sfc(poly1)
# dimension
sf::st_dimension(sf_polygon)
```
```{r}
ggplot(sf_polygon) +
geom_sf(fill = "grey")
```
The geometries we created above are displayed using the `WKT` format. `sf` uses the `WKT` and `WKB` formats to encode vector geometry.
#### Points, Lines, Polygons
We can visualize all the geometries together by running:
```{r}
ggplot() +
geom_sf(data = sf_point, color = "red") +
geom_sf(data = sf_linestring) +
geom_sf(data = sf_polygon, fill = "grey")
```
::: {.callout-note}
`WKT` (Well-Known Text) and `WKB`
- Well-known text (`WKT`) encoding provides a human-readable description of the geometry.
- The well-known binary (`WKB`) encoding is machine-readable, lossless, and faster to work with than text encoding.
`WKB` is used for all interactions with GDAL and GEOS.
:::
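You can inspect both encodings of a geometry directly with `st_as_text()` and `st_as_binary()`:

```r
library(sf)

pt <- st_point(c(1, 3))

st_as_text(pt)    # human-readable: "POINT (1 3)"
st_as_binary(pt)  # machine-readable raw bytes
```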
#### Spatial Data Frames
Building up from basic points, lines and polygons, simple feature objects are stored in a `data.frame`, with geometry data in a special list-column, usually named 'geom' or 'geometry' - this is the same convention used in the popular Python package [GeoPandas](https://geopandas.org).
Let's look at an example spatial data set that comes in the `spData` package, `us_states`:
```{r}
data("us_states")
class(us_states)
names(us_states)
```
We see that it's an `sf` data frame with both attribute columns and the special geometry column we mentioned earlier - in this case named `geometry`.
`us_states$geometry` is the ['list column'](https://adv-r.hadley.nz/vectors-chap.html#list-columns) that contains all the coordinate information for the US state polygons in this dataset.
To reiterate, `us_states` is an `sf` simple feature collection which we can verify with:
```{r}
us_states
```
We see information on
- the geometry type
- the dimensionality
- the bounding box
- the coordinate reference system
- the header of the attributes
Remember that `sf` simple feature collections are composed of:
```{r, fig.align='center', echo = FALSE, out.width="100%",fig.cap="The simple features data structure."}
knitr::include_graphics("img/sf_structure.png")
```
- rows of simple features (sf) that contain attributes and geometry - in green
- each of which have a simple feature geometry list-column (sfc) - in red
- which contains the underlying simple feature geometry (sfg) for each simple feature - in blue
`sf` extends the generic `plot` function so that `plot` will naturally work with `sf` objects.
```{r}
plot(us_states)
```
:::{.callout-note}
#### Exercise
Is this the result you expected from `plot()`?
Instead of making a single default map of your geometry, `plot()` in `sf` maps each variable in the dataset.
How would you plot just the geometry? Or just the two population variables? Try these variations on `plot()` with `us_states`.
Read more about plotting in `sf` by running `?plot.sf()`
:::
:::{.callout-note collapse="true"}
#### Solution
```{r}
#| eval: false
plot(us_states$geometry)
plot(us_states[,c('total_pop_10','total_pop_15')])
#or
us_states |>
dplyr::select(total_pop_10, total_pop_15) |>
plot()
```
:::
The alternative solution above highlights a **key concept** with `sf` - `sf` objects are 'tidy-compliant' and **geometry is sticky** - the geometry column is carried along implicitly through any chained tidy operations.
```{r, fig.align='center', echo = FALSE, out.width="100%",fig.cap="Sticky geometry in `sf` objects." }
knitr::include_graphics("img/Sticky.PNG")
```
Keeping the geometry as part of regular `data.frame` objects in R (and in Python with `GeoPandas`!) has numerous advantages - for instance:
```{r}
summary(us_states['total_pop_15'])
```
The geometry is **sticky**, which means that it is carried along by default!
You can read `sf` objects from shapefiles using `st_read()` or `read_sf()`.
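For example, using the North Carolina counties shapefile that ships with `sf` (`read_sf()` is the quiet, tibble-returning variant of `st_read()`):

```r
library(sf)

# a shapefile bundled with the sf package
shp <- system.file("shape/nc.shp", package = "sf")

nc  <- st_read(shp)   # verbose: reports the layer, geometry type, bbox and CRS
nc2 <- read_sf(shp)   # quiet, returns a tibble-flavored sf object
class(nc)
```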
### Raster Data Model
Raster data can be *continuous* (e.g. elevation, precipitation, atmospheric deposition) or *categorical* (e.g. land use, soil type, geology type). Rasters can also be image-based, with a single band or multiple bands, and can have a temporal component, as in space-time cubes.
```{r, fig.align='center', echo = FALSE, out.width="100%",fig.cap="The raster data model."}
knitr::include_graphics("img/Rasters4.png")
```
Until recently, support for gridded data in R was best implemented by the `raster` package by Robert Hijmans. The `raster` package allowed you to:
* read and write almost any commonly used raster data format
* perform typical raster processing operations such as resampling, projecting, filtering, raster math, etc.
* work with files on disk that are too big to read into memory in R
* run operations quickly since the package relies on back-end C code
The [terra](https://github.com/rspatial/terra) package is the replacement for `raster` and has now superseded it, so we will largely focus on `terra` here. Examples draw from both [Spatial Data Science with R and terra](https://rspatial.org/terra/index.html) and [An introduction to terra in Geocomputation with R](https://geocompr.robinlovelace.net/spatial-class.html#raster-data). Use `help("terra-package")` in the console for a full list of available `terra` functions and a comparison to / changes from `raster`.
Raster representation in R is currently somewhat in flux, with three package choices: `raster` and its successor `terra`, which we've mentioned, as well as [stars](https://r-spatial.github.io/stars/index.html) (spatiotemporal tidy arrays with R).
To familiarize ourselves with the `terra` package, let's create an empty `SpatRaster` object - in order to do this we have to:
* define the matrix (rows and columns)
* define the spatial bounding box
:::{.callout-note}
#### Exercise
Note that typically we would be reading raster data in from a file rather than creating a raster from scratch.
Run the code steps below to familiarize yourself with constructing a `SpatRaster` from scratch.
Try providing a different bounding box for an area of your choosing or changing the dimensions and view result (by simply typing `myrast` in the console).
```{r raster-layer, message=FALSE, warning=FALSE, error=FALSE}
myrast <- rast(ncol = 10, nrow = 10, xmax = -116, xmin = -126, ymin = 42, ymax = 46)
myrast
```
:::
:::{.callout-note collapse="true"}
#### Solution
```{r}
#| eval: false
# just an example:
myrast <- rast(ncol=40, nrow = 40, xmax=-120, xmin=-136, ymin=52, ymax=56)
```
:::
`terra` uses an `S4` slot structure for its `SpatRaster` objects
```{r raster-layer2, message=FALSE, warning=FALSE, error=FALSE}
str(myrast)
isS4(myrast)
```
`terra` has dedicated functions addressing each of the following components:
- `dim()` returns the number of rows, columns and layers
- `ncell()` returns the number of cells (pixels)
- `res()` returns the spatial resolution
- `ext()` returns spatial extent
- `crs()` returns the coordinate reference system
:::{.callout-note}
#### Exercise
Exploring raster objects
1. what is the minimal data required to define a `SpatRaster`?
2. What is the CRS of our `SpatRaster`?
3. How do we pull out just the CRS for our `SpatRaster`?
4. Building on this, what is the code to pull out just our xmin value in our extent, or bounding box?
:::
:::{.callout-note collapse="true"}
#### Solution
1. the number of columns, number of rows, and extent - `terra` will fill in defaults if values aren't provided
```{r raster-emptyexample, message=FALSE, warning=FALSE, error=FALSE}
t <- rast()
t
```
2. We didn't provide one - `terra` uses default `crs` of WGS84 if you don't provide a `crs`
3.
```{r message=FALSE, warning=FALSE, error=FALSE}
crs(myrast)
```
4.
```{r message=FALSE, warning=FALSE, error=FALSE}
ext(myrast)
# just grab xmin
ext(myrast)[1]
```
:::
#### Manipulating `terra` objects
So far we've just constructed a raster container with no values (try plotting what we have so far) - let's provide values to the cells using the `runif` function to derive random values from the uniform distribution:
```{r runif, message=FALSE, warning=FALSE, error=FALSE}
#show we have no values
hasValues(myrast)
values(myrast) <- runif(n=ncell(myrast))
myrast
```
An important point to make here is that objects in the `terra` package (and previously in `raster`) can be either in memory or on disk - note the 'source' value for our `SpatRaster` `myrast`. If this were a large raster on disk, the value would be the path to the file on disk.
```{r inmemory, message=FALSE, warning=FALSE, error=FALSE}
sources(myrast) # "" means the raster is in memory; otherwise a path to a file
hasValues(myrast)
nlyr(myrast) # we just have one layer in our object
```
`terra` also provides a `plot` method for its classes:
```{r plot raster, message=FALSE, warning=FALSE, error=FALSE}
plot(myrast)
```
We can also overwrite the cell values for our raster:
```{r new_values_raster, message=FALSE, warning=FALSE, error=FALSE}
values(myrast) <- 1:ncell(myrast)
values(myrast)[1:15]
```
You can access raster values via direct indexing or (row, column) indexing - take a minute to see how this works using the raster we just created - the syntax is:
```r
myrast[i]
myrast[row, column]
```
:::{.callout-note}
#### Exercise
How is `terra` data storage unlike a `matrix` in R? Try creating a `matrix` with the same dimensions and values and compare it with the `terra` raster you've created:
:::
:::{.callout-note collapse="true"}
#### Solution
```{r matrix_raster, message=FALSE, warning=FALSE, error=FALSE}
m <- matrix(1:100, nrow = 10, ncol = 10)
# a matrix fills column-wise by default, while raster values run row-wise
m[1, 10]
myrast[1,10]
myrast[10]
```
:::
#### Reading existing rasters on disk
```{r single_raster, message=FALSE, warning=FALSE, error=FALSE}
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
my_rast = rast(raster_filepath)
nlyr(my_rast)
plot(my_rast)
```
### Multiband rasters
The `SpatRaster` object in `terra` can hold multiple layers (similar to the `RasterBrick` and `RasterStack` classes in the `raster` package). These layers can correspond to multispectral satellite imagery or a time series of rasters.
```{r multiband, message=FALSE, warning=FALSE, error=FALSE}
landsat = system.file("raster/landsat.tif", package = "spDataLarge")
landsat = rast(landsat)
landsat
plot(landsat)
```
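Individual layers of a multi-band `SpatRaster` can be subset with `[[ ]]`, by position or by name - a quick sketch (which band corresponds to which wavelength depends on the sensor, so check the layer names):
```r
library(terra)
landsat <- rast(system.file("raster/landsat.tif", package = "spDataLarge"))
names(landsat)          # layer names
single <- landsat[[4]]  # pull out one layer by position
nlyr(single)            # 1
plot(single)
```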
#### Raster data with `stars`
`stars` stands for (and works with) spatio-temporal arrays, and can deal with more complex data types than either `raster` or `terra`, such as rotated grids.
`stars` integrates with `sf`, and many `sf` functions have methods for `stars` objects (e.g. `st_bbox` and `st_transform`) - this makes sense, since both packages are written by Edzer Pebesma. `terra`, unfortunately, has poor / no integration with `sf` - this is a big issue for me personally, and I will likely look to `stars` long-term for my raster processing.
Basic example shown in [stars vignette](https://r-spatial.github.io/stars/articles/stars1.html) - reading in the 30m bands of a Landsat-7 image that comes with the `stars` package:
```{r stars_1, message=FALSE, warning=FALSE, error=FALSE}
tif = system.file("tif/L7_ETMs.tif", package = "stars")
ls7 = read_stars(tif)
plot(ls7, axes = TRUE)
```
`ls7` (Landsat 7) is an object with three dimensions (x, y, and band) and one attribute:
```{r stars_2, message=FALSE, warning=FALSE, error=FALSE}
ls7
```
Note when we plotted above that the `plot` method for `stars` uses histogram stretching across all bands - we can also stretch each band individually:
```{r stars_3, message=FALSE, warning=FALSE, error=FALSE}
plot(ls7, axes = TRUE, join_zlim = FALSE)
```
`stars` uses [tidyverse methods for spatio-temporal arrays](https://r-spatial.github.io/stars/articles/stars3.html) - an example of this is pulling out one band of an image using `slice`:
```{r stars_4, message=FALSE, warning=FALSE, error=FALSE}
ls7 |> dplyr::slice(band, 6) -> band6
band6
```
This gives us a lower-dimensional array of just band 6 from the Landsat 7 image.
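Beyond `slice`, `st_apply` lets us compute across dimensions - for instance, a hedged NDVI sketch, assuming band 4 is NIR and band 3 is red for this Landsat 7 image:
```r
library(stars)
tif <- system.file("tif/L7_ETMs.tif", package = "stars")
ls7 <- read_stars(tif)

# NDVI = (NIR - red) / (NIR + red)
ndvi_fun <- function(x) (x[4] - x[3]) / (x[4] + x[3])
ndvi <- st_apply(ls7, c("x", "y"), ndvi_fun)  # apply over each x,y pixel
plot(ndvi)
```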
## Coordinate Reference Systems
A `CRS` is made up of several components:
- **Coordinate system**: The x,y grid that defines where your data lies in space
- **Horizontal and vertical units**: The units describing the grid along the x, y, and possibly z axes
- **Datum**: The modeled version of the shape of the earth
- **Projection details**: If projected, the mathematical equation used to flatten objects from round surface (earth) to flat surface (paper or screen)
```{r, fig.cap="Source: https://mgimond.github.io/Spatial/chp09_0.html", echo=FALSE, out.width="75%"}
knitr::include_graphics("img/Ellipsoid.png")
```
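You can inspect these components from R - `sf::st_crs` exposes pieces of a CRS definition (a quick sketch):
```r
library(sf)
crs <- st_crs(4326)  # WGS84 geographic coordinates
crs$epsg             # the authority code: 4326
crs$proj4string      # legacy Proj4 representation
st_is_longlat(crs)   # TRUE - geographic, not projected
```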
### The ellipsoid and geoid
The earth is roughly a sphere - more precisely, an *ellipsoid* - which is defined by two radii:
- semi-major axis (equatorial radius)
- semi-minor axis (polar radius)
The terms *spheroid* and *ellipsoid* are used interchangeably. One particular spheroid is distinguished from another by the lengths of the semi-major and semi-minor axes. Both the GRS80 and WGS84 spheroids use semi-major and semi-minor axes of 6,378,137 meters and 6,356,752 meters respectively (the semi-minor axes differ by thousandths of a meter between the two). You'll also encounter older spheroids / ellipsoids out there, such as Clarke 1866.
More precisely than an *ellipsoid*, though, we know that the earth is a *geoid* - it is not perfectly smooth - and modelling the undulations due to changes in gravitational pull at different locations is crucial to accuracy in a GIS. This is where a *datum* comes in - a datum is built on top of the selected spheroid and can incorporate local variations in elevation. We have many datums to choose from based on the location of interest - in the US we would typically choose *NAD83*.
### Why you need to know about CRS when working with spatial data in R
```{r crs_fail, message=FALSE, warning=FALSE, error=FALSE}
data(pnw)
gages = read_csv(system.file("extdata/Gages_flowdata.csv", package = "Rspatialworkshop"),show_col_types = FALSE)
gages_sf <- gages |>
st_as_sf(coords = c("LON_SITE", "LAT_SITE"), crs = 4269, remove = FALSE) |>
dplyr::select(STATION_NM,LON_SITE, LAT_SITE)
# Awesome, let's plot our gage data and state boundaries!
plot(pnw$geometry, axes=TRUE)
plot(gages_sf$geometry, col='red', add=TRUE)
# um, what?
```
There is no 'on-the-fly' projection in R - you need to make sure you specify the CRS of your objects, and the CRS of objects needs to match for any spatial operation, or you'll get an error (or, as above, a nonsensical plot).
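The fix for the plot above is to transform one layer into the other's CRS before plotting - a sketch assuming the `pnw` and `gages_sf` objects created in the chunk above:
```r
library(sf)
# project the gages into the same CRS as the state boundaries
gages_proj <- st_transform(gages_sf, st_crs(pnw))
plot(pnw$geometry, axes = TRUE)
plot(gages_proj$geometry, col = "red", add = TRUE)
# now the gages land where they should
```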
- [spatialreference.org](https://spatialreference.org/) is your friend in R - chances are you will use it frequently working with spatial data in R.
- [projection Wizard](https://projectionwizard.org/) is also really useful, as is [epsg.io](https://epsg.io/)
### Defining your coordinate reference system
A coordinate reference system for spatial data can be defined in different ways - for instance for the standard WGS84 coordinate system you could use:
- Proj4
+ ```+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs```
- OGC WKT
+ ```GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]]```
- ESRI WKT
+ ```GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]```
- 'authority:code' identifier
+ `EPSG:4326`
Proj4 used to be the standard CRS identifier in R, but the preferred way now is the 'authority:code' method, which `sf` as well as software like [QGIS](https://docs.qgis.org/3.16/en/docs/user_manual/working_with_projections/working_with_projections.html) will recognize. The 'authority:code' method is durable and easily discoverable online.
The `WKT` representation of EPSG:4326 in `sf` is:
```{r}
sf::st_crs('EPSG:4326')
```
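These different identifiers all describe the same CRS, and `sf` treats them as equal - for example:
```r
library(sf)
# numeric codes and 'authority:code' strings resolve to the same CRS
st_crs(4326) == st_crs("EPSG:4326")  # TRUE
```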
### Projected coordinate systems
Typically we want to work with data that is projected. Projected coordinate systems (which are based on Cartesian coordinates) have: an origin, an x axis, a y axis, and a linear unit of measure. Going from geographic coordinates to a projected coordinate reference systems requires mathematical transformations.
Four spatial properties of projected coordinate systems that are subject to distortion are: *shape*, *area*, *distance* and *direction*. A map that preserves shape is called *conformal*; one that preserves area is called *equal-area*; one that preserves distance is called *equidistant*; and one that preserves direction is called *azimuthal* (from [https://mgimond.github.io/Spatial/chp09_0.html](https://mgimond.github.io/Spatial/chp09_0.html)).
The takeaway from all this is that you need to be aware of the `crs` of your objects in R, make sure they are projected if appropriate and in a projection that optimizes the properties you are interested in, and make sure objects you are analyzing or mapping together are in the same `crs`.
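Putting this into practice: for area calculations over the conterminous US, an equal-area projection such as CONUS Albers (EPSG:5070) is a common choice - a sketch assuming the `pnw` states object from earlier:
```r
library(sf)
# CONUS Albers (EPSG:5070) is equal-area, a reasonable choice when
# comparing state or watershed areas in the lower 48
pnw_albers <- st_transform(pnw, 5070)
st_crs(pnw_albers)$units_gdal  # linear units: metre
st_area(pnw_albers)            # areas computed in m^2
```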