---
title: "Spatial Data Workshop"
subtitle: "2023 EPA R User Group Workshop"
date: October 18, 2023
author:
  - "Marc Weber"
  - "Michael Dumelle"
institute:
  - "EPA (USA)"
  - "EPA (USA)"
format:
  revealjs:
    footer: "Spatial Data Workshop"
    slide-number: true
    preview-links: true
    transition: fade
    theme: [default, slides.scss]
    smaller: false
    auto-stretch: true
    code-link: true
    incremental: false
execute:
  echo: true
embed-resources: true
bibliography: references.bib
---
```{r}
#| label: setup
#| include: false
# set width of code output
options(width = 80)
# load background packages
library(countdown)
```
## Welcome!
1. Please visit [https://usepa.github.io/spatialdata.epaR2023/](https://usepa.github.io/spatialdata.epaR2023/) to view the workshop's accompanying workbook
2. Install and load R packages by visiting "Setup" in the workbook's "Welcome" section
3. (Optional) Download the workshop slides (instructions in the workbook's "Welcome" section)
4. (Optional) Fill out the poll so we can better understand attendees' level of experience handling spatial data
5. Follow along and have fun!
## Who Are We?
Marc Weber is a geographer in the Pacific Ecological Systems Division (PESD) at the United States Environmental Protection Agency (USEPA). His work supports various aspects of the USEPA's National Aquatic Resource Surveys (NARS), which characterize the condition of waters across the United States; he helped develop and maintains the StreamCat and LakeCat datasets. His work focuses on spatial analysis in R and Python, Geographic Information Science (GIS), aquatic ecology, remote sensing, open source science, and environmental modeling.
## Who Are We?
Michael Dumelle is a statistician for the United States Environmental Protection Agency (USEPA). He works primarily on facilitating the survey design and analysis of USEPA's National Aquatic Resource Surveys (NARS), which characterize the condition of waters across the United States. His primary research interests are in spatial statistics, survey design, environmental and ecological applications, and software development.
## Disclaimer
The views expressed in this workshop are those of the authors and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency or the U.S. National Oceanic and Atmospheric Administration. Any mention of trade names, products, or services does not imply an endorsement by the U.S. government, the U.S. Environmental Protection Agency, or the U.S. National Oceanic and Atmospheric Administration. The U.S. Environmental Protection Agency and the U.S. National Oceanic and Atmospheric Administration do not endorse any commercial products, services, or enterprises.
## What Will We Cover?
* Foundations of spatial data
* Visualizing spatial data
* Geoprocessing spatial data
* Advanced applications
* Focus on the **R** computing language
## Today's Agenda
- 1:00pm - 1:35pm EDT: Introduction and Spatial Data Structures in R
- 1:35pm - 2:05pm EDT: Vector Data Model
- 2:05pm - 2:10pm EDT: Break
- 2:10pm - 2:30pm EDT: Raster Data Model
- 2:30pm - 3:15pm EDT: Coordinate Reference Systems and I / O
- 3:15pm - 3:25pm EDT: Break
## Today's Agenda
- 3:25pm - 3:45pm EDT: Spatial Data Visualization
- 3:45pm - 4:15pm EDT: Geoprocessing
- 4:15pm - 4:20pm EDT: Break
- 4:20pm - 5:00pm EDT: Advanced Applications
# Foundations
## Goals
::: goals
1. Understand fundamental spatial data structures and libraries in **R**.
2. Become familiar with coordinate reference systems.
3. Learn geographic I/O (input/output).
:::
## Data Structures
* Review data structures section in workbook
* Cover vectors, matrices, arrays, lists, data frames, etc.
* Cover data frame manipulation using [tidyverse](https://www.tidyverse.org/) functions and the pipe operator (`%>%` or `|>`)
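Data frame pipelines like these can be sketched with the native pipe alone; the data frame below is made up for illustration:

```r
# toy data frame (hypothetical site names and flow values)
sites <- data.frame(
  site = c("A", "B", "C", "D"),
  flow = c(10, 20, 30, 40),
  state = c("OR", "OR", "WA", "WA")
)

# x |> f(y) is equivalent to f(x, y): each step feeds the next
or_mean <- sites |>
  subset(state == "OR") |>
  with(mean(flow))
or_mean  # 15
```

The same chain reads identically with tidyverse verbs (`filter()`, `summarize()`) once those packages are loaded.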
## Why R for Spatial Data Analysis?
* Lightweight, open-source, and cross-platform
* Provides tools for automation and reproducibility
* Seamlessly integrates spatial data analysis and statistical analysis in one environment
* Handles vector and raster data
## R Drawbacks for GIS Work
* R is less interactive than desktop applications like ArcGIS or QGIS
* Handling coordinate reference systems is more challenging
* In-memory analysis can be prohibitive for large data
* Steep learning curve
## A Motivating Example
* Simple, expressive code
```{r}
#| warning: false
library(tmap)
library(tmaptools)
library(dplyr)
# find my town!
my_town <- tmaptools::geocode_OSM("Corvallis OR USA", as.sf = TRUE)
glimpse(my_town)
```
## Your Turn
::: task
1. What does `::` do?
2. What does `geocode_OSM()` do?
3. Explain how the code runs together using the `|>` chaining operator
4. What is `glimpse()`? How is it useful compared to `head()`?
:::
```{r}
#| echo: false
countdown(minutes = 5)
```
## Our Turn
::: task2
1. `::` lets you clarify from which package a function is called (e.g., `tmaptools::geocode_OSM()`)
2. `geocode_OSM()` looks up a named feature in OpenStreetMap and returns the coordinates
3. Do this `|>` then that
4. `glimpse()` prints a transposed preview of a data frame so every column is visible; `head()` shows the first rows but can truncate wide data
:::
## A Choropleth Map
* A choropleth map shades geographic regions (e.g., counties) according to the value of a statistical variable
* The `tigris` package can retrieve census tract, county, and state boundaries as vector `sf` data
```{r}
#| label: fig-tmapexample
#| fig-cap: "Distribution of Oregon counties."
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
library(tigris)
library(sf)
counties <- counties("Oregon", cb = TRUE)
counties$area <- as.numeric(st_area(counties))
glimpse(counties)
tm_shape(counties) +
tm_polygons("area", style="quantile", title="Area of Oregon Counties")
```
## Your Turn
::: task
Create and explore an interactive map of `my_town` by running
```{r}
#| eval: false
library(mapview)
mapviewOptions(fgb=FALSE)
mapview(my_town, col="red", col.regions = "red") +
mapview(counties, alpha.regions = .1)
```
:::
```{r}
#| echo: false
#| message: false
#| warning: false
library(mapview)
countdown(minutes = 4)
```
## Our Turn
```{r}
#| label: fig-mytowncounties
#| fig-cap: "Corvallis among distribution of Oregon counties."
#| fig-align: center
#| out-width: "100%"
#| echo: false
knitr::include_graphics("img/mytown_counties.jpg")
```
## Spatial Data Structures
* Spatial data structures are primarily organized through:
1. [GDAL](https://gdal.org/): For raster and feature abstraction and processing
2. [PROJ](https://proj.org/): For coordinate transformations and projections
3. [GEOS](https://libgeos.org/): For spatial operations (e.g., calculating buffers or centroids) on data with a projected coordinate reference system (CRS)
* We explore these core libraries throughout the workshop
## Types of Spatial Data
* Vector data are composed of points, lines, and polygons that represent discrete spatial entities (e.g., a river, watershed, or stream gauge)
* Raster data divide space into small rectangles (pixels) that represent spatially continuous phenomena like elevation or precipitation
## Types of Spatial Data
```{r}
#| label: fig-vectorraster
#| fig-cap: "Vector and raster data."
#| fig-align: center
#| out-width: "100%"
#| echo: false
knitr::include_graphics("img/09-vec-raster.jpg")
```
## Vector Data Model
* Vector data are described using simple features
* Simple features is a standard specifying how two-dimensional geometries are represented, stored in and retrieved from databases, and which geometric operations may be performed on them
* Provides framework for extending spatial `POINTS` to `LINES` and `POLYGONS`
## Types of Spatial Data
```{r}
#| label: fig-sfmodel
#| fig-cap: "The simple features data model."
#| fig-align: center
#| echo: false
#| out-width: "100%"
knitr::include_graphics("img/09-sf-model.png")
```
## The `sf` Package
* We use the `sf` package to work with spatial vector data in **R**
* Learn more about `sf` at its website [r-spatial.github.io/sf](https://r-spatial.github.io/sf/) and/or by running
```{r}
#| eval: false
vignette(package = "sf") # see which vignettes are available
vignette("sf1") # open the introduction vignette
```
## `sf` Primitives
* There are three basic geometric primitives with corresponding `sf` functions to create them:
1. `POINT`: created with `st_point()`
2. `LINESTRING`: created with `st_linestring()`
3. `POLYGON`: created with `st_polygon()`
* More complex simple feature geometries are built up using `st_multipoint()`, `st_multilinestring()`, `st_multipolygon()`, and `st_geometrycollection()`
* Building blocks for more complicated/large data structures
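These geometries also have a well-known text (WKT) form, which is the notation `sf` uses when printing; the strings below are plain base **R** characters, so no packages are needed:

```r
# WKT forms of the three basic primitives
wkt_point <- "POINT (1 3)"
wkt_line  <- "LINESTRING (1 3, 2 1)"
wkt_poly  <- "POLYGON ((1 3, 2 1, 3 2, 1 3))"

# a polygon ring must close: its first and last coordinate pairs match
ring <- c("1 3", "2 1", "3 2", "1 3")
identical(ring[1], ring[length(ring)])  # TRUE
```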
## `POINT`s
* Points in `sf` are composed of coordinate pairs and a coordinate reference system
* Points have no length or area
```{r}
# create sf object as point
pt1 <- st_point(c(1,3))
pt2 <- st_point(c(2,1))
pt3 <- st_point(c(3,2))
sf_point <- st_sfc(pt1, pt2, pt3)
```
```{r}
#| label: fig-sfpoint
#| fig-cap: "sf `POINT`"
#| output-location: slide
library(ggplot2)
ggplot(sf_point) +
geom_sf(color = "red") +
labs(x = "X", y = "Y")
```
## `LINESTRING`s
* An ordered sequence of two or more `POINT`s
* Points in a line are called vertices (or nodes)
* Linestrings have length but no area
```{r}
# create sf object as linestring
line1 <- sf::st_linestring(rbind(pt1, pt2))
line2 <- sf::st_linestring(rbind(pt2, pt3))
sf_linestring <- st_sfc(line1, line2)
```
```{r}
#| label: fig-sflinestring
#| fig-cap: "sf `LINESTRING`"
#| output-location: slide
ggplot(sf_linestring) +
geom_sf() +
labs(x = "X", y = "Y")
```
## `POLYGON`s
* Four or more points with the same starting and ending point
* Has length and area
```{r}
pt4 <- pt1
poly1 <- st_polygon(list(rbind(pt1, pt2, pt3, pt4)))
sf_polygon <- st_sfc(poly1)
# dimension
sf::st_dimension(sf_polygon)
```
```{r}
#| label: fig-sfpolygon
#| fig-cap: "sf `POLYGON`"
#| output-location: slide
ggplot(sf_polygon) +
geom_sf(fill = "grey")
```
## Putting Them All Together
```{r}
#| label: fig-sfallshape
#| fig-cap: "sf `POINT`, `LINESTRING`, and `POLYGON` overlain"
#| output-location: slide
ggplot() +
geom_sf(data = sf_point, color = "red") +
geom_sf(data = sf_linestring) +
geom_sf(data = sf_polygon, fill = "grey")
```
## Spatial Data Frames
* `sf` objects combine `data.frame` attributes with `POINT`, `LINESTRING`, and `POLYGON` building blocks (these are the `geometry`)
* Also characterize dimensionality, bounding box, coordinate reference system, attribute headers
* Read from shapefiles using `st_read()` or `read_sf()`
```{r}
#| warning: false
#| message: false
#| output-location: slide
library(spData)
data("us_states")
print(us_states)
```
## Spatial Data Frames
```{r}
#| label: fig-sfstructure
#| fig-cap: "The simple features data structure."
#| fig-align: center
#| echo: false
#| out-width: "100%"
knitr::include_graphics("img/sf_structure.png")
```
## Your Turn
::: task
So far we have explored plotting using `ggplot()`, but you can also use `sf`'s base plotting via `plot()`. Read more about plotting in `sf` by running `?plot.sf` and then make an appropriate plot of the `us_states` data.
:::
```{r}
#| echo: false
countdown(minutes = 5)
```
## Our Turn
::: task2
Calling `plot()` on an `sf` object plots every variable; it is common to subset to a single column first
:::
```{r}
#| label: fig-usarea
#| fig-cap: "Area by state in the US."
#| output-location: slide
plot(us_states["AREA"])
```
## Raster Data Model
* Raster data can be continuous or categorical
* Rasters can be image based and can have a temporal component
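The grid structure can be sketched with nothing but a base **R** matrix (the extent numbers here are made up): a raster is cell values plus an extent, and resolution is extent divided by cell count.

```r
# a matrix stands in for a raster: 3 rows x 4 columns of cell values
elev_grid <- matrix(1:12, nrow = 3, ncol = 4)

# hypothetical extent in map units
xmin <- 0; xmax <- 4; ymin <- 0; ymax <- 3

# cell size along each axis = extent / number of cells
res_x <- (xmax - xmin) / ncol(elev_grid)
res_y <- (ymax - ymin) / nrow(elev_grid)
c(res_x, res_y)  # 1 1
```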
## Raster Data Model
```{r}
#| label: fig-rastermodel
#| fig-cap: "The raster data model."
#| fig-align: center
#| echo: false
#| out-width: "100%"
knitr::include_graphics("img/raster_data_model.jpg")
```
## The `terra` Package
* We use the `terra` package to work with spatial raster data in **R**
* `terra` is the modern replacement for `raster`
* Learn more about `terra` at its website [https://rspatial.org/](https://rspatial.org/)
* `stars` for spatio-temporal rasters
## Creating a Raster Object
* Create an empty `SpatRaster` object; define rows, columns, bounding box
```{r}
library(terra)
myrast <- rast(ncol = 10, nrow = 10, xmax = -116, xmin = -126, ymin = 42, ymax = 46)
str(myrast)
print(myrast)
```
## Your Turn
::: task
We just printed `myrast` and saw several components: `class`, `dimensions`, `resolution`, `extent`, and `coord. ref`. Explicitly return and inspect each of these pieces using `class()`, `dim()`, `res()`, `ext()`, and `crs()`. Bonus: Return the number of raster cells using `ncell()`.
:::
```{r}
#| echo: false
countdown(minutes = 5)
```
## Our Turn
```{r}
class(myrast)
dim(myrast)
res(myrast)
ext(myrast)
crs(myrast)
ncell(myrast)
```
## Manipulating Raster Objects
* Let's add values to the raster object
```{r}
values(myrast) <- 1:ncell(myrast)
```
```{r}
#| label: fig-rastplot
#| fig-cap: "Raster plot."
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
plot(myrast)
```
## Reading a Raster Object
* Let's read in a raster object from the `spDataLarge` package that contains elevation data from Zion National Park
```{r}
raster_filepath <- system.file("raster/srtm.tif", package = "spDataLarge")
my_rast <- rast(raster_filepath)
```
```{r}
#| label: fig-rastplotzion
#| fig-cap: "Raster plot of elevation from Zion National Park."
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
plot(my_rast)
```
## Your Turn
::: task
The `SpatRaster` object in `terra` can hold multiple layers. Read in the four bands of Landsat 8 data from Zion National Park by running
```{r}
landsat <- system.file("raster/landsat.tif", package = "spDataLarge")
landsat <- rast(landsat)
```
Then plot this raster using `plot()`.
:::
```{r}
#| echo: false
countdown(minutes = 3)
```
## Our Turn
```{r}
#| label: fig-rastplotzion8
#| fig-cap: "Raster plot of Landsat 8 data from Zion National Park."
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
plot(landsat)
```
## Spatio-Temporal Raster Objects
* The `stars` package has tools for spatio-temporal raster data
* `stars` integrates with `sf` (recall that `terra` does not)
* Learn more at their website [https://r-spatial.github.io/stars/](https://r-spatial.github.io/stars/)
## Coordinate Reference Systems
A coordinate reference system (CRS) is made up of several components
1. Coordinate system: The x,y grid that defines where your data lies in space
2. Horizontal and vertical units: The units that describe the coordinate system
3. Datum: The modeled version of the earth's shape
4. Projection details (if applicable): The mathematical equation used to flatten objects from round surface (earth) to flat surface (paper or screen)
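To make the projection component concrete, here is a toy equirectangular projection in base **R**; the spherical earth, radius, and reference latitude are simplifying assumptions (real work goes through PROJ):

```r
# toy equirectangular projection: lon/lat degrees -> planar x/y in meters
R_earth <- 6371000                  # mean earth radius (spherical assumption)
deg2rad <- function(d) d * pi / 180

lat0 <- 45                          # reference latitude (assumed)
lon <- -123.29; lat <- 44.57        # roughly Corvallis, OR

x <- R_earth * deg2rad(lon) * cos(deg2rad(lat0))
y <- R_earth * deg2rad(lat)
```

Here `x` is negative (west of the prime meridian) and `y` is positive (north of the equator); different projections swap in different formulas to preserve shape, area, or distance.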
## The Ellipsoid and Geoid
* The earth can be thought of as an ellipsoid defined by a semi-major (equatorial) axis and a semi-minor (polar) axis
* More precisely, it is a geoid that is not perfectly smooth
* A datum is built on top of the ellipsoid and incorporates local features
* The most appropriate datum depends on the location of interest (in the US we typically use NAD83)
## The Ellipsoid and Geoid
```{r}
#| label: fig-ellipse
#| fig-cap: "The ellipsoid and geoid."
#| fig-align: center
#| echo: false
#| out-width: "75%"
knitr::include_graphics("img/Ellipsoid.png")
```
## Your Turn
::: task
There are several websites that are helpful for learning more about specific coordinate reference systems
1. Spatial Reference: [https://spatialreference.org/](https://spatialreference.org/)
2. Projection Wizard: [https://projectionwizard.org/](https://projectionwizard.org/)
3. EPSG: [https://epsg.io/](https://epsg.io/)
Spend a few minutes learning about one or more of these references by visiting their websites.
:::
```{r}
#| echo: false
countdown(minutes = 4)
```
## Specifying a CRS
There are many ways to specify a CRS:
1. Proj4
* `+proj=longlat +ellps=WGS84 ...`
2. OGC WKT / ESRI WKT
* `GEOGCS["WGS 84", DATUM["WGS_1984", SPHEROID["WGS 84", ...]]]`
3. authority:code identifier (modern/preferred)
* `EPSG:4326`
## Projected Coordinate Systems
* Projected coordinates have been projected to two-dimensional space according to a CRS
* Projected coordinates have an origin, x-axis, y-axis, and unit of measure
* Conformal projections preserve shape
* Equal-area projections preserve area
* Equidistant projections preserve distance
* Azimuthal projections preserve direction
## Projected Coordinate Systems
* Here is an example using vector data; see the workbook for an example using raster data
```{r}
#| label: fig-transformcrs
#| fig-cap: "PNW State Boundaries."
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
library(Rspatialworkshop)
data("pnw")
# transform one to the other
utmz11 <- 2153
pnw <- st_transform(pnw, crs = utmz11)
ggplot() +
geom_sf(data = pnw, color="black", fill=NA) +
labs(title="State Boundaries in the Pacific Northwest") +
theme_bw()
```
## Your Turn
::: task
Re-project the `pnw` data to a different projected CRS. Then plot using base **R** or `ggplot2`.
:::
```{r}
#| echo: false
countdown(minutes = 4)
```
## Our Turn
::: task2
Project to NAD83 / Conus Albers (EPSG:5070)
:::
```{r}
pnw2 <- st_transform(pnw, crs = 5070)
plot(st_geometry(pnw2))
```
## A Note on `S2`
* `sf` version `1.0.0` supports spherical geometry operations via its interface to Google's `S2` spherical geometry engine
* `S2` is an example of a Discrete Global Grid System (DGGS)
* `sf` can run with `S2` on or off; by default the `S2` geometry engine is turned on:
```{r}
#| eval: false
sf::sf_use_s2()
```
## Geographic Data I/O (Input/Output)
There are several ways to read spatial data into R
1. Load spatial data from our machine or a remote source
2. Load spatial data as part of an **R** package
3. Load data using an API (which often makes use of an **R** package)
4. Convert flat files with x, y coordinates to spatial data
5. Geocode data "by hand" (we saw this earlier)
## Vector Data I/O
`sf` can read numerous file types:
1. shapefiles
2. geodatabases
3. geopackages
4. geojson
5. spatial database files
## Read in a Shapefile
```{r}
#| label: fig-citylims
#| fig-cap: "Oregon City Limits"
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
filepath <- system.file("extdata/city_limits.shp", package = "Rspatialworkshop")
citylims <- read_sf(filepath)
plot(st_geometry(citylims), axes = TRUE, main = "Oregon City Limits")
```
## Your Turn
::: task
Run `?read_sf()` and compare `read_sf()` to `st_read()`. Our preference is to use `read_sf()` -- why do you think that is?
:::
```{r}
#| echo: false
countdown(minutes = 3)
```
## Our Turn
::: task2
`read_sf()` calls `st_read()` with defaults chosen:
- `stringsAsFactors = FALSE`
- `quiet = TRUE`
- `as_tibble = TRUE`
:::
## Read in a Geodatabase
```{r}
#| label: fig-stateparks
#| fig-cap: "State Park Boundaries"
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
filepath <- system.file("extdata/StateParkBoundaries.gdb", package = "Rspatialworkshop")
# List all feature classes in a file geodatabase
st_layers(filepath)
# Read the feature class
parks <- st_read(dsn = filepath, layer="StateParkBoundaries")
ggplot(parks) +
geom_sf()
```
## Read in a Geopackage
```{r}
#| label: fig-nc
#| fig-cap: "North Carolina Counties"
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
filepath <- system.file("gpkg/nc.gpkg", package="sf")
nc <- read_sf(filepath)
ggplot(nc) +
geom_sf()
```
## Geopackage Advantages
Why use geopackages over shapefiles?
1. Geopackages avoid the multi-file format of shapefiles
2. Geopackages avoid the 2 GB size limit of shapefiles
3. Geopackages are open-source and follow OGC standards
4. Geopackages are lighter in file size than shapefiles
5. Geopackages avoid the 10-character limit to column headers in shapefile attribute tables (stored in archaic `.dbf` files)
## Other Approaches
The workbook shows how to read in data using
1. Open spatial data sources
2. **R** packages
3. OpenStreetMap
## Raster Data I/O
* Read in data using `rast()`
```{r}
#| label: fig-barelev
#| fig-cap: "Elevation Data Barplot"
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
filepath <- system.file("ex/elev.tif", package="terra")
elev <- rast(filepath)
barplot(elev, digits = -1, las = 2, ylab = "Frequency")
```
## Raster Data I/O
```{r}
#| label: fig-elev
#| fig-cap: "Elevation Data"
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
plot(elev)
```
## Raster Data I/O
```{r}
#| label: fig-L7
#| fig-cap: "Landsat 7"
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
library(stars)
filepath <- system.file("tif/L7_ETMs.tif", package = "stars")
L7 <- read_stars(filepath) |>
slice(index = 1, along = "band")
plot(L7)
```
## Convert Flat Files to Spatial
It is common for a flat file (e.g., a `.csv`) to have columns that indicate coordinates -- how do we turn this into a spatial vector object?
```{r}
#| label: fig-gagesflat
#| fig-cap: "Stream Gages From a Flat File"
#| output-location: slide
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
filepath <- system.file("extdata/Gages_flowdata.csv", package = "Rspatialworkshop")
gages <- read.csv(filepath)
gages_sf <- gages |>
st_as_sf(coords = c("LON_SITE", "LAT_SITE"), crs = 4269, remove = FALSE)
ggplot(data = gages_sf) +
geom_sf()
```
# Visualization
## Goals
::: goals
* Gain familiarity with plotting and visualizing spatial data in R
* Work with four specific visualization and plotting libraries:
+ `ggplot2`
+ `leaflet`
+ `mapview`
+ `tmap`
:::
## Some Advice
Some advice from Martijn Tennekes:
- If you know some `ggplot`, don't care about interactive maps, and don't want to spend a lot of time learning new packages, use `ggplot`
- If you want interactive maps as flexible as possible, use `leaflet`
- If you want to simply explore spatial objects interactively as easily as possible, use `mapview`
- Otherwise, use `tmap`!
Projections are important!
## `ggplot2`
`ggplot2` now supports `geom_sf()`, developed alongside `sf`, for publication-quality figures that easily incorporate projections
```{r}
library(tidycensus)
library(cowplot)
mult_tracts <- get_acs(
  state = 'OR', county = 'Multnomah', geography = 'tract',
  variables = c('B19013_001', 'B16010_028', 'B01003_001'), geometry = TRUE
)
# tidy data
mult_wide <- mult_tracts |>
sf::st_transform(2153) |> #UTM 11N
dplyr::filter(!is.na(estimate)) |>
tidyr::pivot_wider(
names_from = c("variable"),
values_from = c("estimate","moe")
) |>
dplyr::rename(
Median_Income = estimate_B19013_001,
College_Education = estimate_B16010_028,
Total_Pop = estimate_B01003_001
) |>
dplyr::mutate(Perc_College_Ed = (College_Education / Total_Pop) * 100) |>
dplyr::filter(!is.na(Median_Income) & !is.na(Perc_College_Ed))
```
## `ggplot2`
```{r}
#| label: fig-medianmc
#| fig-cap: "Median income in Multnomah County."
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
mult_wide |>
ggplot() +
geom_sf(aes(fill = Median_Income)) +
scale_y_continuous() +
scale_fill_viridis_c(option = "mako") +
theme_minimal_grid(12)
```
## `ggplot2`
```{r}
#| label: fig-collegemc
#| fig-cap: "Percent college degree in Multnomah County."
#| fig-width: 8
#| results: hide
#| message: false
#| warning: false
mult_wide |>
ggplot() +
geom_sf(aes(fill = Perc_College_Ed)) +
scale_y_continuous() +
scale_fill_viridis_c(option = "mako") +
theme_minimal_grid(12)
```
## Your Turn
::: task
Visualize the counties in your state after an appropriate projection. You can access county data for your state (for us, Oregon) by running
:::
```{r}
your_state <- "Oregon"
counties <- tigris::counties(your_state, cb = TRUE)
```
```{r}
#| echo: false
countdown(minutes = 4)
```
## Our Turn
```{r}
#| label: fig-orcounties
#| fig-cap: "Oregon Counties."
#| results: hide
#| message: false
#| warning: false
counties |>
st_transform(2991) |>
ggplot() +
geom_sf(fill = NA, color = gray(.5))
```
## `leaflet`
`leaflet` wraps the JavaScript Leaflet library for fully customizable interactive web maps
```{r}
#| eval: false
library(leaflet)
content <- paste(sep = "<br/>",
"<b><a href='https://www.epa.gov/greeningepa/pacific-ecological-systems-division-pesd-laboratory'>EPA Lab Corvallis</a></b>",
"200 S.W. 35th Street ",
"Corvallis, OR 97333 "
)
leaflet() |>
addTiles() |>
addPopups(-123.290391, 44.565548, content,
options = popupOptions(closeButton = FALSE)
)
```