diff --git a/404.html b/404.html
index 437b6f84..6cb53ba9 100644
--- a/404.html
+++ b/404.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
geom_area()
geom_bar()
geom_line()
geom_point()
geom_polygon()
geom_histogram()
geom_rect()
; geom_tile()
; geom_raster()
geom_text()
geom_text()
label aesthetic, along with others
# Filtering to simplify the example
+mpg |>
+ filter(manufacturer == "ford") |>
+ ggplot(aes(displ, hwy, label = model)) +
+ geom_text()
position and other parameters are also useful.
facets, multiple layers and statistics
-
-
+
+
“A coordinate system, coord for short, maps the position of objects onto the plane of the plot. Position is often specified by two coordinates (x, y), but could be any number of coordinates. The Cartesian coordinate system is the most common coordinate system
@@ -614,11 +620,11 @@
Coord_polar
-ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
- geom_bar(stat = "identity", position = "identity", fill = NA) +
- theme(legend.position = "none") +
- coord_polar()
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
+ geom_bar(stat = "identity", position = "identity", fill = NA) +
+ theme(legend.position = "none") +
+ coord_polar()
“The angle component is particularly useful for cyclical data because the starting and ending points of a single cycle are adjacent. Common cyclical variables are components of dates, like days of the year or hours of the day, and angles, like wind direction.”
@@ -626,36 +632,36 @@
“In the grammar, a pie chart is a stacked bar geom drawn in a polar coordinate system.” (Wickham, 2010, p. 22)
-
-
+
+
Figure 15 shows this, as well as a bullseye plot, which arises when we map the height to radius instead of angle. (Wickham, 2010, p. 22)
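As a rough illustration of the quoted idea (a sketch, not the figure from the paper), a single stacked bar can be drawn in polar coordinates. The choice of the mpg data and the class variable is an assumption made purely for illustration. Mapping the bar height to the angle gives a pie chart, while mapping it to the radius gives a bullseye plot.

```r
library(ggplot2)

# One stacked bar: a constant x position, filled by class (illustrative choice)
stacked <- ggplot(mpg, aes(x = factor(1), fill = class)) +
  geom_bar(width = 1)

# Bar height (count) mapped to the angle: pie chart
pie <- stacked + coord_polar(theta = "y")

# Bar height mapped to the radius (theta = "x" is the default): bullseye plot
bullseye <- stacked + coord_polar()
```

Printing `pie` and `bullseye` side by side shows that the data and the geom are identical; only the coordinate system differs.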
-
-
+
+
The Coxcomb plot is a bar chart in polar coordinates. Note that the categories abut in the Coxcomb, but are separated in the bar chart: this is an example of a graphical convention that differs in different coordinate systems. (Wickham, 2010, p. 23)
-library(patchwork)
-a <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width =
-1) + theme(legend.position = "none")
-b <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width =
-1) + coord_polar (theta="y") + theme(legend.position = "none")
-a + b
library(patchwork)
+a <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width =
+1) + theme(legend.position = "none")
+b <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width =
+1) + coord_polar (theta="y") + theme(legend.position = "none")
+a + b
Defaults
The full ggplot2 specification of the scatterplot of price versus weight is:
-
-
+
+
All of these alternatives are allowed:
-ggplot(mpg, aes(displ, hwy, colour = class)) +
- geom_point()
-
-ggplot(mpg, aes(displ, hwy)) +
- geom_point(aes(colour = class))
-
-ggplot(mpg, aes(displ)) +
- geom_point(aes(y = hwy, colour = class))
-
-ggplot(mpg) +
- geom_point(aes(displ, hwy, colour = class))
ggplot(mpg, aes(displ, hwy, colour = class)) +
+ geom_point()
+
+ggplot(mpg, aes(displ, hwy)) +
+ geom_point(aes(colour = class))
+
+ggplot(mpg, aes(displ)) +
+ geom_point(aes(y = hwy, colour = class))
+
+ggplot(mpg) +
+ geom_point(aes(displ, hwy, colour = class))
But in some cases, such as when a geom_smooth() layer is added, it matters where the secondary aesthetics are specified: they may need to be set in the layer parameters rather than in ggplot() to produce the correct result.
In the first case the smooth line doesn’t show up.
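A minimal sketch of why the placement matters, assuming a linear smoother with `se = FALSE` (these settings are illustrative and not taken from the notes). When colour is mapped in ggplot(), every layer inherits it, so geom_smooth() fits one line per class; when colour is mapped only inside geom_point(), the smoother sees no grouping and fits a single line.

```r
library(ggplot2)

# colour mapped in ggplot(): inherited by both layers,
# so geom_smooth() fits a separate line for each class
p_global <- ggplot(mpg, aes(displ, hwy, colour = class)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# colour mapped only in geom_point(): geom_smooth() has no grouping
# variable and fits one line through all the points
p_local <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  geom_smooth(method = "lm", se = FALSE)
```

Comparing the two plots makes the effect of aesthetic inheritance visible: the points look the same, but the smooth layer differs.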
diff --git a/alpha-scales.html b/alpha-scales.html
index 411b41c6..b565e78d 100644
--- a/alpha-scales.html
+++ b/alpha-scales.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
base + annotate(
- geom = "text", x = 42, y = 20, label = "The Adelie species is on all 3 islands", size = 5, color = "darkcyan")
-
+base + annotate(
+ geom = "text", x = 42, y = 20, label = "The Adelie species is on all 3 islands", size = 5, color = "darkcyan")
Arrows Code
-base +
- annotate(
- geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5,
- curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
- ) +
- annotate(geom = "text", x = 53.1, y = 20,
- label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") +
- annotate(
- geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5,
- curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
- ) +
- annotate(geom = "text", x = 32, y = 20.3,
- label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") +
- annotate(
- geom = "curve", x = 53, y = 15, xend = 48, yend = 15,
- curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
- ) +
- annotate(geom = "text", x = 53, y = 15.3,
- label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan")
base +
+ annotate(
+ geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5,
+ curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
+ ) +
+ annotate(geom = "text", x = 53.1, y = 20,
+ label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") +
+ annotate(
+ geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5,
+ curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
+ ) +
+ annotate(geom = "text", x = 32, y = 20.3,
+ label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") +
+ annotate(
+ geom = "curve", x = 53, y = 15, xend = 48, yend = 15,
+ curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
+ ) +
+ annotate(geom = "text", x = 53, y = 15.3,
+ label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan")
Arrows Plot
-base +
- annotate(
- geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
- ) +
- annotate(geom = "text", x = 53.1, y = 20, label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") +
- annotate(
- geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
- ) +
- annotate(geom = "text", x = 32, y = 20.3, label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") +
- annotate(
- geom = "curve", x = 53, y = 15, xend = 48, yend = 15, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
- ) +
- annotate(geom = "text", x = 53, y = 15.3, label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan") +
- theme(legend.position = "none")
astronauts %>%
- filter(nationality %in% c("U.S.","Australia", "U.K.", "U.S.S.R/Russia", "Japan")) %>%
- ggplot(aes(x = nationality, y = hours_mission, color = hours_mission)) +
- coord_flip() +
- geom_point(size = 4, alpha = 0.15) +
- geom_boxplot(color = "gray60", outlier.alpha = 0) +
- stat_summary(fun = mean, geom = "point", size = 5, color = "dodgerblue") +
- annotate(
- geom = "curve", x = 3.8, y = 2500, xend = 4, yend = 650,
- curvature = .3, arrow = arrow(length = unit(2, "mm"))
-) +
- annotate(
- "text", x = 3.7, y = 2500,
- label = "The U.S. Mean Hours Mission", size = 2.7) +
- annotate(
- geom = "curve", x = 4.7, y = 4200, xend = 5, yend = 2800,
- curvature = .3, arrow = arrow(length = unit(2, "mm"))
-) +
- annotate(
- "text", x = 4.5, y = 3700,
- label = "The interquartile range, between 25% and 75% of values", size = 2.8) +
- annotate(
- geom = "curve", x = 1, y = 3800, xend = 1, yend = 900,
- curvature = .3, arrow = arrow(length = unit(2, "mm"))
-) +
- annotate(
- "text", x = .8, y = 3000,
- label = "Australian Astronaut Andrew S. W. Thomas
- completed missions in 1983, 1998, 2001, 2005 and is now retired", size = 2.8) +
- scale_color_viridis_c() +
- scale_y_continuous(limits = c(0, 5000)) +
- labs(title = "Length of Astronaut Missions in hours",
- subtitle = "A Study was conducted on the effects of space on various individuals",
- caption = "Source: TidyTuesday 2020 week 29 \n inspired by plots in The Evolution of a ggplot (ep1) by Cedric Scherer") +
- theme_fivethirtyeight() +
- theme(legend.position = "none") +
- theme(plot.title = element_text(hjust = .5)) +
- theme(plot.subtitle = element_text(hjust = .5))
base +
+ annotate(
+ geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
+ ) +
+ annotate(geom = "text", x = 53.1, y = 20, label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") +
+ annotate(
+ geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
+ ) +
+ annotate(geom = "text", x = 32, y = 20.3, label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") +
+ annotate(
+ geom = "curve", x = 53, y = 15, xend = 48, yend = 15, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm"))
+ ) +
+ annotate(geom = "text", x = 53, y = 15.3, label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan") +
+ theme(legend.position = "none")
astronauts %>%
+ filter(nationality %in% c("U.S.","Australia", "U.K.", "U.S.S.R/Russia", "Japan")) %>%
+ ggplot(aes(x = nationality, y = hours_mission, color = hours_mission)) +
+ coord_flip() +
+ geom_point(size = 4, alpha = 0.15) +
+ geom_boxplot(color = "gray60", outlier.alpha = 0) +
+ stat_summary(fun = mean, geom = "point", size = 5, color = "dodgerblue") +
+ annotate(
+ geom = "curve", x = 3.8, y = 2500, xend = 4, yend = 650,
+ curvature = .3, arrow = arrow(length = unit(2, "mm"))
+) +
+ annotate(
+ "text", x = 3.7, y = 2500,
+ label = "The U.S. Mean Hours Mission", size = 2.7) +
+ annotate(
+ geom = "curve", x = 4.7, y = 4200, xend = 5, yend = 2800,
+ curvature = .3, arrow = arrow(length = unit(2, "mm"))
+) +
+ annotate(
+ "text", x = 4.5, y = 3700,
+ label = "The interquartile range, between 25% and 75% of values", size = 2.8) +
+ annotate(
+ geom = "curve", x = 1, y = 3800, xend = 1, yend = 900,
+ curvature = .3, arrow = arrow(length = unit(2, "mm"))
+) +
+ annotate(
+ "text", x = .8, y = 3000,
+ label = "Australian Astronaut Andrew S. W. Thomas
+ completed missions in 1983, 1998, 2001, 2005 and is now retired", size = 2.8) +
+ scale_color_viridis_c() +
+ scale_y_continuous(limits = c(0, 5000)) +
+ labs(title = "Length of Astronaut Missions in hours",
+ subtitle = "A Study was conducted on the effects of space on various individuals",
+ caption = "Source: TidyTuesday 2020 week 29 \n inspired by plots in The Evolution of a ggplot (ep1) by Cedric Scherer") +
+ theme_fivethirtyeight() +
+ theme(legend.position = "none") +
+ theme(plot.title = element_text(hjust = .5)) +
+ theme(plot.subtitle = element_text(hjust = .5))
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
- geom_line() +
- theme(legend.position = "none")
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
- geom_bar(stat = "identity", position = "identity", fill = NA) +
- theme(legend.position = "none")
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
- geom_point() +
- geom_smooth(method = "lm") +
- labs(title = "What type of graph would you call this?", subtitle = "Notice the defaults of ggplot2") +
- theme(plot.title = element_text(size = 15, color =
- "firebrick", face = "bold", hjust = .5)) +
- theme(plot.subtitle = element_text(hjust = .5))
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
+ geom_line() +
+ theme(legend.position = "none")
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
+ geom_bar(stat = "identity", position = "identity", fill = NA) +
+ theme(legend.position = "none")
ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) +
+ geom_point() +
+ geom_smooth(method = "lm") +
+ labs(title = "What type of graph would you call this?", subtitle = "Notice the defaults of ggplot2") +
+ theme(plot.title = element_text(size = 15, color =
+ "firebrick", face = "bold", hjust = .5)) +
+ theme(plot.subtitle = element_text(hjust = .5))
print(CoordCartesian$transform)
+
<ggproto method>
<Wrapper function>
function (...)
@@ -617,7 +623,7 @@ 19.5 Coords
print(coord_sf)
+
diff --git a/data-sources.html b/data-sources.html
index 42aaffa6..7c44ad5e 100644
--- a/data-sources.html
+++ b/data-sources.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/data.html b/data.html
index ce024b8f..3c6c17db 100644
--- a/data.html
+++ b/data.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/date-time-colour-scales.html b/date-time-colour-scales.html
index 18c31e91..517320e8 100644
--- a/date-time-colour-scales.html
+++ b/date-time-colour-scales.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/date-time.html b/date-time.html
index ded6e658..92f6f0af 100644
--- a/date-time.html
+++ b/date-time.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/dealing-with-overplotting.html b/dealing-with-overplotting.html
index 45dc96bb..f9fd3bd2 100644
--- a/dealing-with-overplotting.html
+++ b/dealing-with-overplotting.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -604,20 +610,20 @@
4.5 Dealing with overplotting
To compensate for overplotting, tweaking the aesthetics can help, for example by using hollow glyphs.
-df <- data.frame(x = rnorm(2000), y = rnorm(2000))
-norm <- ggplot(df, aes(x, y)) + xlab(NULL) + ylab(NULL)
-norm + geom_point()
+df <- data.frame(x = rnorm(2000), y = rnorm(2000))
+norm <- ggplot(df, aes(x, y)) + xlab(NULL) + ylab(NULL)
+norm + geom_point()
-
+
-
+
Alternatively, for larger data sets, you can use alpha blending (transparency). If you specify alpha as a ratio, the denominator gives the number of points that must be overplotted to give a solid color.
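A minimal sketch of alpha blending on the same kind of simulated data as above. The fixed seed and the particular ratios (1/3 and 1/10) are assumptions chosen for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = rnorm(2000), y = rnorm(2000))
norm <- ggplot(df, aes(x, y)) + xlab(NULL) + ylab(NULL)

# With alpha = 1 / 3, three overlapping points give a solid colour
p_alpha_3 <- norm + geom_point(alpha = 1 / 3)

# With alpha = 1 / 10, ten overlapping points are needed
p_alpha_10 <- norm + geom_point(alpha = 1 / 10)
```

Smaller ratios suit denser data, since more points must overlap before a region saturates.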
-
+
-
+
-
+
geom_jitter() can be used if your data has some discreteness. By default, the jitter is 40% of the resolution of the data in each direction. You can override the default with the width and height arguments.
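A short sketch of overriding the jitter defaults. The use of mpg with cyl and hwy is an assumption picked because cyl is discrete; the specific width is illustrative only.

```r
library(ggplot2)

# Default jitter in both directions
p_default <- ggplot(mpg, aes(cyl, hwy)) +
  geom_jitter()

# Jitter only horizontally, by at most 0.1 units; height = 0 leaves
# the y values exactly where they are in the data
p_narrow <- ggplot(mpg, aes(cyl, hwy)) +
  geom_jitter(width = 0.1, height = 0)
```

Setting `height = 0` is a reasonable choice when the y variable is a measurement whose exact value should not be distorted.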
Alternatively, we can think of overplotting as a 2d density estimation problem, which gives rise to two more approaches:
@@ -627,14 +633,14 @@ 4.5 Dealing with overplotting
The code below compares square and hexagonal bins, using the parameters bins and binwidth to control the number and size of the bins.
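A sketch of the binning comparison under stated assumptions: the simulated data, the fixed seed and the value `bins = 10` are illustrative, and geom_hex() additionally needs the hexbin package to be installed when the plot is drawn.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = rnorm(2000), y = rnorm(2000))
norm <- ggplot(df, aes(x, y)) + xlab(NULL) + ylab(NULL)

# Square bins with the default number of bins
p_square_default <- norm + geom_bin2d()

# Square bins, fewer and larger
p_square_bins <- norm + geom_bin2d(bins = 10)

# Hexagonal bins (drawing this plot requires the hexbin package)
p_hex <- norm + geom_hex(bins = 10)
```

Every observation lands in exactly one bin, so the bin counts always add up to the number of rows in the data.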
-
+
-
+
-
+
-
+
Another approach to dealing with overplotting is to add data summaries to help guide the eye to the true shape of the pattern within the data.
diff --git a/defintions-in-this-chapter.html b/defintions-in-this-chapter.html
index f246b65a..bf99c7f6 100644
--- a/defintions-in-this-chapter.html
+++ b/defintions-in-this-chapter.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/directlabels-package.html b/directlabels-package.html
index fac3815a..b2be1445 100644
--- a/directlabels-package.html
+++ b/directlabels-package.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -601,51 +607,51 @@
7.5 Directlabels Package
+
Base Code Nurse Salary
-library(ggthemes)
-library(scales)
-library(ggthemes)
-library(scales)
-g <-
-nurses %>%
- group_by(year) %>%
- filter(state %in% c("Minnesota", "Wisconsin", "Iowa", "North Dakota", "Illinois", "Indiana", "Kansas", "Michigan", "Missouri", "Nebraska", "Ohio")) %>%
- ggplot(aes(year, annual_salary_median, color = state, )) +
- geom_line() +
- labs(
- title = "Annual Median RN Salary by Midwestern State"
- ) +
- theme(legend.position = "none") +
- geom_vline(xintercept = c(2007, 2009), size = 1.5,
- color = "darkgoldenrod1", linetype = "dashed") +
- gghighlight::gghighlight(state %in% c("Minnesota", "Wisconsin", "Iowa")) +
- theme_economist() +
- scale_color_economist(name = NULL) +
- theme(axis.title = element_blank()) +
- scale_y_continuous(labels = comma_format())
-
+library(ggthemes)
+library(scales)
+library(ggthemes)
+library(scales)
+g <-
+nurses %>%
+ group_by(year) %>%
+ filter(state %in% c("Minnesota", "Wisconsin", "Iowa", "North Dakota", "Illinois", "Indiana", "Kansas", "Michigan", "Missouri", "Nebraska", "Ohio")) %>%
+ ggplot(aes(year, annual_salary_median, color = state, )) +
+ geom_line() +
+ labs(
+ title = "Annual Median RN Salary by Midwestern State"
+ ) +
+ theme(legend.position = "none") +
+ geom_vline(xintercept = c(2007, 2009), size = 1.5,
+ color = "darkgoldenrod1", linetype = "dashed") +
+ gghighlight::gghighlight(state %in% c("Minnesota", "Wisconsin", "Iowa")) +
+ theme_economist() +
+ scale_color_economist(name = NULL) +
+ theme(axis.title = element_blank()) +
+ scale_y_continuous(labels = comma_format())
+
gghighlight and facets
-
-
+
+
examples in geom_richtext
-library(ggtext)
-
-lab_html <- "★ geom_richtext can modify with html"
-
-g +
- geom_richtext(aes(x = 2010, y = 50000, label = lab_html),
- stat = "unique", angle = 30, color = "white", fill = "steelblue")
-
+library(ggtext)
+
+lab_html <- "★ geom_richtext can modify with html"
+
+g +
+ geom_richtext(aes(x = 2010, y = 50000, label = lab_html),
+ stat = "unique", angle = 30, color = "white", fill = "steelblue")
+
geom_textbox
-lab_long <- "**The Great Recession** <br><b style='font-size:10pt;color:steelblue;'> Minnesota's RN Annual Salaries increased during the great recession and then completely flattened out before rising again after 2015"
-
-g +
- geom_textbox(aes(x = 2015, y = 40000, label = lab_long),
- width = unit(15, "lines"), stat = "unique")
-
+lab_long <- "**The Great Recession** <br><b style='font-size:10pt;color:steelblue;'> Minnesota's RN Annual Salaries increased during the great recession and then completely flattened out before rising again after 2015"
+
+g +
+ geom_textbox(aes(x = 2015, y = 40000, label = lab_long),
+ width = unit(15, "lines"), stat = "unique")
+
diff --git a/discrete-colour-scales.html b/discrete-colour-scales.html
index d9f88d02..4a9c0254 100644
--- a/discrete-colour-scales.html
+++ b/discrete-colour-scales.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/discrete.html b/discrete.html
index 5b6f9f32..58b80126 100644
--- a/discrete.html
+++ b/discrete.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/displaying-distributions.html b/displaying-distributions.html
index dcf1ed14..64839ec7 100644
--- a/displaying-distributions.html
+++ b/displaying-distributions.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart: geom_area()
+- 2.3 Bar chart: geom_bar()
+- 2.4 Line chart: geom_line()
+- 2.5 Scatterplot: geom_point()
+- 2.6 Polygons: geom_polygon()
+- 2.7 Histograms: geom_histogram()
+- 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()
+- 2.9 Add text to a plot: geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -604,13 +610,13 @@
4.4 Displaying distributions
For one-dimensional (1d) continuous data, the histogram is arguably the most important geom.
-
+
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
-
+
## Warning: Removed 45 rows containing non-finite outside the scale range
## (`stat_bin()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
@@ -622,46 +628,46 @@ 4.4 Displaying distributions
facet_wrap(~ var).
- Use colour and a frequency polygon, geom_freqpoly().
- Use a “conditional density plot”, geom_histogram(position = "fill").
-ggplot(diamonds, aes(depth)) +
- geom_freqpoly(aes(colour = cut), binwidth = 0.1, na.rm = TRUE) +
- xlim(58, 68) +
- theme(legend.position = "none")
+ggplot(diamonds, aes(depth)) +
+ geom_freqpoly(aes(colour = cut), binwidth = 0.1, na.rm = TRUE) +
+ xlim(58, 68) +
+ theme(legend.position = "none")
-ggplot(diamonds, aes(depth)) +
- geom_histogram(aes(fill = cut), binwidth = 0.1, position = "fill",
- na.rm = TRUE) +
- xlim(58, 68) +
- theme(legend.position = "none")
+ggplot(diamonds, aes(depth)) +
+ geom_histogram(aes(fill = cut), binwidth = 0.1, position = "fill",
+ na.rm = TRUE) +
+ xlim(58, 68) +
+ theme(legend.position = "none")
You can also plot density using geom_density(). Use a density plot when you know that the underlying density is smooth, continuous and unbounded.
-ggplot(diamonds, aes(depth)) +
- geom_density(na.rm = TRUE) +
- xlim(58, 68) +
- theme(legend.position = "none")
+ggplot(diamonds, aes(depth)) +
+ geom_density(na.rm = TRUE) +
+ xlim(58, 68) +
+ theme(legend.position = "none")
-ggplot(diamonds, aes(depth, fill = cut, colour = cut)) +
- geom_density(alpha = 0.2, na.rm = TRUE) +
- xlim(58, 68) +
- theme(legend.position = "none")
+ggplot(diamonds, aes(depth, fill = cut, colour = cut)) +
+ geom_density(alpha = 0.2, na.rm = TRUE) +
+ xlim(58, 68) +
+ theme(legend.position = "none")
When many distributions need to be compared, it is often advisable to sacrifice quality for quantity. The following three types of graph are examples of this trade-off.
geom_boxplot():
-
+
- ggplot(diamonds, aes(carat, depth)) +
- geom_boxplot(aes(group = cut_width(carat, 0.1))) +
- xlim(NA, 2.05)
+ ggplot(diamonds, aes(carat, depth)) +
+ geom_boxplot(aes(group = cut_width(carat, 0.1))) +
+ xlim(NA, 2.05)
## Warning: Removed 997 rows containing missing values or values outside the scale range
## (`stat_boxplot()`).
geom_violin()
:
-
+
-ggplot(diamonds, aes(carat, depth)) +
- geom_violin(aes(group = cut_width(carat, 0.1))) +
- xlim(NA, 2.05)
+ggplot(diamonds, aes(carat, depth)) +
+ geom_violin(aes(group = cut_width(carat, 0.1))) +
+ xlim(NA, 2.05)
## Warning: Removed 997 rows containing non-finite outside the scale range
## (`stat_ydensity()`).
@@ -674,8 +680,8 @@ 4.4.1 Exercise:ggplot(diamonds, aes(price)) +
- geom_histogram(binwidth = 5)
+
The smaller the quantity (assuming comparable quality), the higher the price. I presume that carat would also correlate strongly with both quantity and price.
@@ -683,11 +689,11 @@ 4.4.1 Exercise:
- How does the distribution of
price
vary with clarity
?
-
+
-
+
I presume that, whichever geom is used, the higher the clarity, the higher the price and the smaller the quantity.
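One way to check this presumption is sketched below. It assumes the `diamonds` data that ships with ggplot2; the object names and the `binwidth = 500` value are illustrative choices, not part of the original notes. Carat also differs across clarity grades, so the raw comparison may not follow the presumed ordering.

```r
library(ggplot2)

# Boxplots of price within each clarity grade (a sketch, not the book's solution)
p_box <- ggplot(diamonds, aes(clarity, price)) +
  geom_boxplot()

# Frequency polygons, one line per clarity grade; binwidth = 500 is arbitrary
p_freq <- ggplot(diamonds, aes(price, colour = clarity)) +
  geom_freqpoly(binwidth = 500)

# Median price per clarity grade, to read alongside the plots
med_price <- tapply(diamonds$price, diamonds$clarity, median)
med_price
```

Printing `p_box` or `p_freq` draws the plot; `med_price` gives a numeric summary to compare against what the plots suggest.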
diff --git a/what-low-level-geoms-are-used-to-draw-geom_boxplot.html b/drawing-rectangles-geom_rect-geom_tile-geom_raster.html
similarity index 92%
rename from what-low-level-geoms-are-used-to-draw-geom_boxplot.html
rename to drawing-rectangles-geom_rect-geom_tile-geom_raster.html
index e348b6af..f1e3abf7 100644
--- a/what-low-level-geoms-are-used-to-draw-geom_boxplot.html
+++ b/drawing-rectangles-geom_rect-geom_tile-geom_raster.html
@@ -4,18 +4,18 @@
- 2.8 What low-level geoms are used to draw geom_boxplot()? | ggplot2 Book Club
+ 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster() | ggplot2 Book Club
-
+
-
+
@@ -23,15 +23,15 @@
-
+
-
-
+
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,19 +600,19 @@
-
-
-
+
+
diff --git a/exercise-solutions.html b/exercise-solutions.html
new file mode 100644
index 00000000..a3453613
--- /dev/null
+++ b/exercise-solutions.html
@@ -0,0 +1,738 @@
+
+
+
+
+
+
+ 2.10 Exercise solutions | ggplot2 Book Club
+2.10 Exercise solutions
+
+2.10.1 Exercise 1
+
+- What geoms would you use to draw each of the following named plots?
+
+- scatterplot =
geom_point()
+- line chart =
geom_line()
+- histogram =
geom_histogram()
+- bar chart =
geom_bar()
or geom_col()
+- pie chart =
geom_bar()
with coord_polar()
+
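The pie chart answer can be sketched as follows, assuming the `mpg` data from ggplot2. The name `pie` and the choice of `cyl` as the fill variable are illustrative.

```r
library(ggplot2)

# A stacked bar at a single x position, bent into a circle:
# theta = "y" maps the bar heights (counts) to the angle.
pie <- ggplot(mpg, aes(x = factor(1), fill = factor(cyl))) +
  geom_bar(width = 1) +
  coord_polar(theta = "y")

pie
```

Dropping `coord_polar()` shows the same layer as an ordinary stacked bar, which is why no separate pie geom is needed.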
+
+
+
+
+
+
+
+2.10.2 Exercise 2
+
+geom_path()
connects points in order of appearance. geom_line()
connects points from left to right.
+
+
+
+
+geom_polygon()
draws polygons which are filled paths.
+
+
+
+
+geom_line()
connects points from left to right.
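The difference between the three geoms is easiest to see on data whose x values are not sorted. The data frame below is hypothetical and only meant as a minimal sketch.

```r
library(ggplot2)

# Hypothetical data: x is deliberately not sorted
df <- data.frame(x = c(3, 1, 2), y = c(1, 2, 3))

p_path    <- ggplot(df, aes(x, y)) + geom_path()     # joins rows in data order
p_line    <- ggplot(df, aes(x, y)) + geom_line()     # joins rows by increasing x
p_polygon <- ggplot(df, aes(x, y)) + geom_polygon()  # closes the path and fills it

# x values in the order each geom will draw them
path_x <- layer_data(p_path)$x
line_x <- layer_data(p_line)$x
```

Comparing `path_x` with `line_x` shows that only `geom_line()` reorders the data before drawing.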
+
+
+
+
+
+2.10.3 Exercise 3
+
+- What low-level geoms are used to draw geom_smooth()?
+
+geom_smooth()
fits a smoother to data, displaying the smooth and its standard error, allowing you to see a dominant pattern within a scatterplot with a lot of “noise”. The low-level geoms for geom_smooth()
are geom_path()
, geom_area()
and geom_point()
.
+
+
+
+## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
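As a rough illustration of building a smooth from simpler layers, the sketch below combines a ribbon and a line that both use the smooth stat. This is an approximation for comparison, not a claim about how `geom_smooth()` is implemented internally. `geom_ribbon()` is used for the band (`geom_area()` is the special case of a ribbon with its lower edge at zero), and the object names are illustrative.

```r
library(ggplot2)

# Approximate geom_smooth() from simpler layers that share the smooth stat:
# a ribbon for the confidence band and a line for the fitted curve.
p_manual <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_ribbon(stat = "smooth", method = "loess", formula = y ~ x, alpha = 0.2) +
  geom_line(stat = "smooth", method = "loess", formula = y ~ x)

# Reference plot using the high-level geom
p_smooth <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)
```

Drawing `p_manual` next to `p_smooth` makes it easier to see which parts of the smooth are a band and which are a path.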
+
+
+- What low-level geoms are used to draw geom_boxplot()?
+
+- Box plots are used to summarize the distribution of a set of points using summary statistics. The low-level geoms for
geom_boxplot()
are geom_rect()
, geom_line()
and geom_point()
.
+
+
+
+
+
+- What low-level geoms are used to draw geom_violin()?
+
+- Violin plots show a compact representation of the density of the distribution, highlighting the areas where most of the points are found. The low-level geoms for
geom_violin()
are geom_area()
and geom_path()
.
diff --git a/extending-ggplot2.html b/extending-ggplot2.html
index d4209966..069435b9 100644
--- a/extending-ggplot2.html
+++ b/extending-ggplot2.html
@@ -23,7 +23,7 @@
-
+
diff --git a/extra.html b/extra.html
index 9619f79a..43429014 100644
--- a/extra.html
+++ b/extra.html
@@ -23,7 +23,7 @@
-
+
diff --git a/faceting-2.html b/faceting-2.html
index f17ad650..30dda039 100644
--- a/faceting-2.html
+++ b/faceting-2.html
@@ -23,7 +23,7 @@
-
+
diff --git a/faceting-annotations.html b/faceting-annotations.html
index 6b8760c2..878dc4c4 100644
--- a/faceting-annotations.html
+++ b/faceting-annotations.html
@@ -23,7 +23,7 @@
-
+
@@ -596,21 +602,21 @@
7.6 Faceting Annotations
-
-
+
+
The grid package's default npc units scale coordinates between 0 and 1
-
-
+
+
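A minimal sketch of using these 0 to 1 coordinates to place the same annotation in every facet follows. It assumes the `mpg` data; the label text, positions and object names are illustrative.

```r
library(ggplot2)
library(grid)

# A text grob positioned in npc units: (0, 0) is the bottom-left corner of
# the drawing area and (1, 1) the top-right, whatever the data scales are.
label <- textGrob(
  "top right",
  x = unit(0.95, "npc"), y = unit(0.95, "npc"),
  hjust = 1, vjust = 1
)

# annotation_custom() draws the grob in each panel, so the label sits in the
# same relative position in every facet.
p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~drv) +
  annotation_custom(label)
```

Because the position is relative to the panel rather than the data, the label does not move when the facets use free scales.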
diff --git a/faceting.html b/faceting.html
index cbfe59d9..5c0615ef 100644
--- a/faceting.html
+++ b/faceting.html
@@ -23,7 +23,7 @@
-
+
diff --git a/facets.html b/facets.html
index 573c6b63..e327000b 100644
--- a/facets.html
+++ b/facets.html
@@ -23,7 +23,7 @@
-
+
@@ -596,14 +602,14 @@
15.1 Facets
-
-base <- ggplot(mpg2, aes(displ, hwy)) +
- geom_blank() +
- xlab(NULL) +
- ylab(NULL)
-
-mpg2%>%count(class)
+
+base <- ggplot(mpg2, aes(displ, hwy)) +
+ geom_blank() +
+ xlab(NULL) +
+ ylab(NULL)
+
+mpg2%>%count(class)
# A tibble: 6 × 2
class n
<chr> <int>
@@ -613,27 +619,27 @@ 15.1 Facetsbase + facet_wrap(~class, ncol = 3)
-
-
-
-
-
-
-
-
-
-
-
-
-
-p <- ggplot(mpg, aes(cty, hwy)) +
- geom_abline() +
- geom_jitter(width = 0.1, height = 0.1)
-p +
- facet_grid(drv ~ cyl)
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+p <- ggplot(mpg, aes(cty, hwy)) +
+ geom_abline() +
+ geom_jitter(width = 0.1, height = 0.1)
+p +
+ facet_grid(drv ~ cyl)
+
+
<ggproto object: Class FacetWrap, Facet, gg>
compute_layout: function
draw_back: function
@@ -650,10 +656,10 @@ 15.1 Facetsp+
- facet_wrap(~cyl, scales = "free_y")
-
-
+
+
+
# A tibble: 574 × 2
date n
<date> <int>
@@ -668,62 +674,62 @@ 15.1 Facetsggplot(economics_long, aes(date, value)) +
- geom_line() +
- facet_wrap(~variable, scales = "free_y", ncol = 1)
-
-mpg2$model <- reorder(mpg2$model, mpg2$cty)
-
-mpg2$manufacturer <- reorder(mpg2$manufacturer, -mpg2$cty)
-
-ggplot(mpg2, aes(cty, model)) +
- geom_point() +
- facet_grid(manufacturer ~ ., scales = "free", space = "free") +
- theme(strip.text.y = element_text(angle = 0))
-
-
-ggplot(df1, aes(x, y)) +
- geom_point(data = df2, colour = "red", size = 2) +
- geom_point() +
- facet_wrap(~gender)
-
-df <- data.frame(
- x = rnorm(120, c(0, 2, 4)),
- y = rnorm(120, c(1, 2, 1)),
- z = letters[1:3]
-)
-
-ggplot(df, aes(x, y)) +
- geom_point(aes(colour = z))
-
-
-
-df_sum <- df %>%
- group_by(z) %>%
- summarise(x = mean(x), y = mean(y)) %>%
- rename(z2 = z)
-
-
-ggplot(df, aes(x, y)) +
- geom_point() +
- geom_point(data = df_sum, aes(colour = z2), size = 4) +
- facet_wrap(~z)
-
-df2 <- dplyr::select(df, -z)
-
-ggplot(df, aes(x, y)) +
- geom_point(data = df2, colour = "grey70") +
- geom_point(aes(colour = z)) +
- facet_wrap(~z)
-
-age<-seq(18,60,1)
-id <- seq(1,42,1)
-my_df <- as.data.frame(cbind(id,age))
-
-my_df %>% mutate(age_cat=cut_interval(age,length=5))%>%head()
+ggplot(economics_long, aes(date, value)) +
+ geom_line() +
+ facet_wrap(~variable, scales = "free_y", ncol = 1)
+
+mpg2$model <- reorder(mpg2$model, mpg2$cty)
+
+mpg2$manufacturer <- reorder(mpg2$manufacturer, -mpg2$cty)
+
+ggplot(mpg2, aes(cty, model)) +
+ geom_point() +
+ facet_grid(manufacturer ~ ., scales = "free", space = "free") +
+ theme(strip.text.y = element_text(angle = 0))
+
+
+ggplot(df1, aes(x, y)) +
+ geom_point(data = df2, colour = "red", size = 2) +
+ geom_point() +
+ facet_wrap(~gender)
+
+df <- data.frame(
+ x = rnorm(120, c(0, 2, 4)),
+ y = rnorm(120, c(1, 2, 1)),
+ z = letters[1:3]
+)
+
+ggplot(df, aes(x, y)) +
+ geom_point(aes(colour = z))
+
+
+
+df_sum <- df %>%
+ group_by(z) %>%
+ summarise(x = mean(x), y = mean(y)) %>%
+ rename(z2 = z)
+
+
+ggplot(df, aes(x, y)) +
+ geom_point() +
+ geom_point(data = df_sum, aes(colour = z2), size = 4) +
+ facet_wrap(~z)
+
+df2 <- dplyr::select(df, -z)
+
+ggplot(df, aes(x, y)) +
+ geom_point(data = df2, colour = "grey70") +
+ geom_point(aes(colour = z)) +
+ facet_wrap(~z)
+
+age<-seq(18,60,1)
+id <- seq_along(age) # one id per age value; seq(18,60,1) has 43 elements
+my_df <- as.data.frame(cbind(id,age))
+
+my_df %>% mutate(age_cat=cut_interval(age,length=5))%>%head()
id age age_cat
1 1 18 [15,20]
2 2 19 [15,20]
@@ -731,18 +737,18 @@ 15.1 Facets# Bins of width 1
-mpg2$disp_w <- cut_width(mpg2$displ, 1)
-# Six bins of equal length
-mpg2$disp_i <- cut_interval(mpg2$displ, 6)
-# Six bins containing equal numbers of points
-mpg2$disp_n <- cut_number(mpg2$displ, 6)
-
-plot <- ggplot(mpg2, aes(cty, hwy)) +
- geom_point() +
- labs(x = NULL, y = NULL)
-plot + facet_wrap(~disp_w, nrow = 1)
-
+# Bins of width 1
+mpg2$disp_w <- cut_width(mpg2$displ, 1)
+# Six bins of equal length
+mpg2$disp_i <- cut_interval(mpg2$displ, 6)
+# Six bins containing equal numbers of points
+mpg2$disp_n <- cut_number(mpg2$displ, 6)
+
+plot <- ggplot(mpg2, aes(cty, hwy)) +
+ geom_point() +
+ labs(x = NULL, y = NULL)
+plot + facet_wrap(~disp_w, nrow = 1)
+
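The three cutting helpers differ in what they hold constant. The sketch below uses simulated right-skewed values instead of `mpg2` so that it runs on its own; `x` and the object names are hypothetical.

```r
library(ggplot2)

set.seed(1)
x <- rexp(300)  # hypothetical right-skewed values

by_width    <- cut_width(x, 1)     # bins of width 1: counts vary
by_interval <- cut_interval(x, 6)  # six bins of equal length: counts vary
by_number   <- cut_number(x, 6)    # six bins of equal count: lengths vary

table(by_width)
table(by_interval)
table(by_number)
```

For skewed data, `cut_number()` is the only one of the three that guarantees every facet has a comparable number of points.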
diff --git a/first-steps.html b/first-steps.html
index 245f1473..7bfa71b2 100644
--- a/first-steps.html
+++ b/first-steps.html
@@ -23,7 +23,7 @@
-
+
diff --git a/from-the-ggplot2-book.html b/from-the-ggplot2-book.html
index 2629f2cd..f611c8b6 100644
--- a/from-the-ggplot2-book.html
+++ b/from-the-ggplot2-book.html
@@ -23,7 +23,7 @@
-
+
@@ -601,9 +607,9 @@
3.4 From the ggplot2 bookalso a longitudinal study.
- note that the age is standardized.
-
+
## Grouped Data: height ~ age | Subject
## Subject age height Occasion
## 1 1 -1.0000 140.5 1
@@ -632,16 +638,16 @@ 3.4.1 Multiple Groups, One Aesthe
In the case of Oxboys, we want to plot a line over time for each boy, so Subject
is the grouping variable in the aesthetic.
-
+
- incorrectly specifying the grouping variable leads to a “characteristic sawtooth appearance”.
-
+
@@ -653,13 +659,13 @@ 3.4.2 Different Groups on Differe
- now that we have plotted individual geoms, let’s add a collective geom which is the trendline for all boys together.
-ggplot(Oxboys, aes(age, height, group = Subject)) +
- geom_line() +
- geom_point() +
- geom_smooth(method = "lm", se = FALSE)
+ggplot(Oxboys, aes(age, height, group = Subject)) +
+ geom_line() +
+ geom_point() +
+ geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'
-
+
- something doesn’t look right
- expecting a collective geom (one summary line for all subjects), but we got individual geoms again – a trendline for each individual instead of a trendline for all individuals.
@@ -667,13 +673,13 @@ 3.4.2 Different Groups on Differe
- we got multiple
geom_smooth
s because we had the grouping variable in the ggplot
line so the grouping flows down to all layers of the plot
- to get what we intend, we need to uncouple the grouping variable at the
ggplot
layer and add it where we want the grouping to happen, namely only at the geom_line
layer. That allows the default grouping from the ggplot
layer (i.e., no special grouping or just group on the whole dataset) to flow down to the geom_smooth
layer.
-ggplot(Oxboys, aes(age, height)) +
- geom_line(aes(group = Subject)) +
- geom_point() +
- geom_smooth(method = "lm", size = 2, se = FALSE)
+ggplot(Oxboys, aes(age, height)) +
+ geom_line(aes(group = Subject)) +
+ geom_point() +
+ geom_smooth(method = "lm", linewidth = 2, se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'
-
+
3.4.3 Overriding the Default Grouping
@@ -682,20 +688,20 @@ 3.4.3 Overriding the Default Grou
By adding the grouping to geom_line
, we overrode the default grouping, which was “no special grouping”.
Here’s another example to help illustrate this point a little better. Thanks to this blog post.
Subtitles are added to these plots to describe what’s going on.
-ggplot(mpg, aes(drv, hwy)) +
- geom_jitter() +
- stat_boxplot(fill = NA) +
- labs(subtitle = "stat_boxplot automatically uses the groups set by the categorical variable drv.\nNotice that there is only one boxplot for each value of drv.")
+ggplot(mpg, aes(drv, hwy)) +
+ geom_jitter() +
+ stat_boxplot(fill = NA) +
+ labs(subtitle = "stat_boxplot automatically uses the groups set by the categorical variable drv.\nNotice that there is only one boxplot for each value of drv.")
-ggplot(mpg, aes(drv, hwy, color = factor(year))) +
- geom_jitter() +
- stat_boxplot(fill = NA) +
- labs(subtitle = "by now adding color based on year, it creates a new group for the boxplots as well,\nand there are now two for each categorical. This may not be what you want.")
+ggplot(mpg, aes(drv, hwy, color = factor(year))) +
+ geom_jitter() +
+ stat_boxplot(fill = NA) +
+ labs(subtitle = "by now adding color based on year, it creates a new group for the boxplots as well,\nand there are now two for each categorical. This may not be what you want.")
-ggplot(mpg, aes(drv, hwy, color = factor(year))) +
-geom_jitter() +
-stat_boxplot(fill = NA, aes(group = drv)) +
- labs(subtitle = "we override the default or earlier grouping by adding\na group -- inside the aes -- on the layer where we want it")
+ggplot(mpg, aes(drv, hwy, color = factor(year))) +
+ geom_jitter() +
+ stat_boxplot(fill = NA, aes(group = drv)) +
+ labs(subtitle = "we override the default or earlier grouping by adding\na group -- inside the aes -- on the layer where we want it")
## Warning: The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
@@ -706,34 +712,34 @@ 3.4.3 Overriding the Default Grou
3.4.4 A couple of exercises
-
+
## # A tibble: 2 × 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa…
## 2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa…
-#Draw a boxplot of hwy for each value of cyl, without turning cyl into a factor. What extra aesthetic do you need to set?
-
-# Wrong... but cyl is an integer data type -- are integers considered continuous?
-ggplot(mpg, aes(cyl, hwy)) +
- geom_boxplot()
+#Draw a boxplot of hwy for each value of cyl, without turning cyl into a factor. What extra aesthetic do you need to set?
+
+# Wrong... but cyl is an integer data type -- are integers considered continuous?
+ggplot(mpg, aes(cyl, hwy)) +
+ geom_boxplot()
## Warning: Continuous x aesthetic
## ℹ did you forget `aes(group = ...)`?
-
+
-#Modify the following plot so that you get one boxplot per integer value of displ.
-
-ggplot(mpg, aes(displ, cty)) +
- geom_boxplot()
+#Modify the following plot so that you get one boxplot per integer value of displ.
+
+ggplot(mpg, aes(displ, cty)) +
+ geom_boxplot()
## Warning: Continuous x aesthetic
## ℹ did you forget `aes(group = ...)`?
-
+
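Both warnings above point to the same missing piece, the `group` aesthetic. A possible solution is sketched below, assuming the `mpg` data. Using `cut_width(displ, 1)` is one reasonable reading of "one boxplot per integer value", since its bins are centred on whole numbers; it is not the only option.

```r
library(ggplot2)

# One boxplot per value of cyl, keeping cyl numeric
p_cyl <- ggplot(mpg, aes(cyl, hwy)) +
  geom_boxplot(aes(group = cyl))

# One boxplot per unit-wide bin of displ
p_displ <- ggplot(mpg, aes(displ, cty)) +
  geom_boxplot(aes(group = cut_width(displ, 1)))

# Number of boxes each plot will draw
n_cyl   <- nrow(layer_data(p_cyl))
n_displ <- nrow(layer_data(p_displ))
```

Setting `group` on the boxplot layer leaves the x axis continuous, so the boxes are drawn at their numeric positions.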
diff --git a/functional-programming.html b/functional-programming.html
index 890d7a65..0c571358 100644
--- a/functional-programming.html
+++ b/functional-programming.html
@@ -23,7 +23,7 @@
-
+
@@ -597,59 +603,59 @@
17.3 Functional programming
One example is a function that wraps several geoms into a reusable plot. For this we can look at the “Corporate Reputation” data from #TidyTuesday 2022, week 22.
-poll <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/poll.csv')
-reputation <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/reputation.csv')
-
-
-rep2<-reputation%>%
- group_by(company,industry)%>%
- summarize(score,rank)%>%
- ungroup()%>%
- mutate(year=2022)
-
-
-full <- poll%>%
- filter(!is.na(year))%>%
- full_join(rep2,by=c("2022_rank"="rank","2022_rq"="score","company","industry","year")) %>%
- count(year,company,industry,"rank"=`2022_rank`,"score"=`2022_rq`,sort=T) %>%
- arrange(-year)
-
-##################
-
-# mapping = aes(x = fct_reorder(x,-y), y = y, fill = y, color = y, label = y)
-
-rank_plot <- function(data,mapping) {
- data %>%
- ggplot(mapping)+ # aes(x=fct_reorder(x,-y),y=y)
- geom_col(width =0.3, # aes(fill=rank)
- show.legend = F)+
- geom_text(hjust=0,fontface="bold", # aes(label=rank,color=rank),
- show.legend = F)+
- scale_y_discrete(expand = c(0, 0, .5, 0))+
- coord_flip()+
- ggthemes::scale_fill_continuous_tableau(palette = "Green-Gold")+
- ggthemes::scale_color_continuous_tableau(palette = "Green-Gold")+
- labs(title="",
- x="",y="")+
- theme(axis.text.x = element_blank(),
- axis.text.y = element_text(face="bold"),
- axis.ticks.x = element_blank(),
- axis.ticks.y = element_line(size=2),
- panel.grid.major.x = element_blank(),
- panel.grid.minor.x = element_blank(),
- panel.grid.major.y = element_line(size=2),
- plot.background = element_rect(color="grey95",fill="grey95"),
- panel.background = element_rect(color="grey92",fill="grey92"))
-}
-
-df<-full%>%
- filter(year==2017,
- industry=="Retail")
-
-rank_plot(data = df,
- mapping = aes(x=fct_reorder(company,-rank),y=rank,
- fill = rank, color = rank, label = rank))
-
+poll <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/poll.csv')
+reputation <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/reputation.csv')
+
+
+rep2<-reputation%>%
+ group_by(company,industry)%>%
+ summarize(score,rank)%>%
+ ungroup()%>%
+ mutate(year=2022)
+
+
+full <- poll%>%
+ filter(!is.na(year))%>%
+ full_join(rep2,by=c("2022_rank"="rank","2022_rq"="score","company","industry","year")) %>%
+ count(year,company,industry,"rank"=`2022_rank`,"score"=`2022_rq`,sort=T) %>%
+ arrange(-year)
+
+##################
+
+# mapping = aes(x = fct_reorder(x,-y), y = y, fill = y, color = y, label = y)
+
+rank_plot <- function(data,mapping) {
+ data %>%
+ ggplot(mapping)+ # aes(x=fct_reorder(x,-y),y=y)
+ geom_col(width =0.3, # aes(fill=rank)
+ show.legend = F)+
+ geom_text(hjust=0,fontface="bold", # aes(label=rank,color=rank),
+ show.legend = F)+
+ scale_y_discrete(expand = c(0, 0, .5, 0))+
+ coord_flip()+
+ ggthemes::scale_fill_continuous_tableau(palette = "Green-Gold")+
+ ggthemes::scale_color_continuous_tableau(palette = "Green-Gold")+
+ labs(title="",
+ x="",y="")+
+ theme(axis.text.x = element_blank(),
+ axis.text.y = element_text(face="bold"),
+ axis.ticks.x = element_blank(),
+ axis.ticks.y = element_line(size=2),
+ panel.grid.major.x = element_blank(),
+ panel.grid.minor.x = element_blank(),
+ panel.grid.major.y = element_line(size=2),
+ plot.background = element_rect(color="grey95",fill="grey95"),
+ panel.background = element_rect(color="grey92",fill="grey92"))
+}
+
+df<-full%>%
+ filter(year==2017,
+ industry=="Retail")
+
+rank_plot(data = df,
+ mapping = aes(x=fct_reorder(company,-rank),y=rank,
+ fill = rank, color = rank, label = rank))
+
diff --git a/general-housekeeping-items-1.html b/general-housekeeping-items-1.html
index c6affb7c..11edf76a 100644
--- a/general-housekeeping-items-1.html
+++ b/general-housekeeping-items-1.html
@@ -23,7 +23,7 @@
-
+
diff --git a/general-housekeeping-items.html b/general-housekeeping-items.html
index 442f3b17..04f534ed 100644
--- a/general-housekeeping-items.html
+++ b/general-housekeeping-items.html
@@ -23,7 +23,7 @@
-
+
diff --git a/geoms-1.html b/geoms-1.html
index 07bd28a0..26926d77 100644
--- a/geoms-1.html
+++ b/geoms-1.html
@@ -23,7 +23,7 @@
-
+
diff --git a/geoms-2.html b/geoms-2.html
index cc30ff9f..cd0004e0 100644
--- a/geoms-2.html
+++ b/geoms-2.html
@@ -23,7 +23,7 @@
-
+
@@ -612,7 +618,7 @@
19.4 Geomsprint(GeomSpoke$setup_data)
+
<ggproto method>
<Wrapper function>
function (...)
@@ -631,7 +637,7 @@ 19.4 Geomsprint(GeomSmooth$draw_group)
+
<ggproto method>
<Wrapper function>
function (...)
diff --git a/geoms.html b/geoms.html
index 0d33d87c..65c6d327 100644
--- a/geoms.html
+++ b/geoms.html
@@ -23,7 +23,7 @@
-
+
diff --git a/grammar-of-graphics.html b/grammar-of-graphics.html
index 3bddc3e8..1bdba263 100644
--- a/grammar-of-graphics.html
+++ b/grammar-of-graphics.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/hi-my-name-is.html b/hi-my-name-is.html
index 8cc1200e..0d1a57e6 100644
--- a/hi-my-name-is.html
+++ b/hi-my-name-is.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/histogram.html b/histograms-geom_histogram.html
similarity index 92%
rename from histogram.html
rename to histograms-geom_histogram.html
index 9e699281..6bf8ca83 100644
--- a/histogram.html
+++ b/histograms-geom_histogram.html
@@ -4,18 +4,18 @@
-
2.3 Histogram: | ggplot2 Book Club
+ 2.7 Histograms: geom_histogram() | ggplot2 Book Club
-
+
-
+
@@ -23,15 +23,15 @@
-
+
-
-
+
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,9 +600,10 @@
diff --git a/identity-scales.html b/identity-scales.html
index bb95096d..3ee6fb67 100644
--- a/identity-scales.html
+++ b/identity-scales.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/index.html b/index.html
index 43a2c89a..9dfd771c 100644
--- a/index.html
+++ b/index.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,7 +603,7 @@
ggplot2 Book Club
-2024-08-01
+2024-08-06
Welcome
diff --git a/individual-geoms.html b/individual-geoms.html
index 5f6c88a2..94eccd3e 100644
--- a/individual-geoms.html
+++ b/individual-geoms.html
@@ -23,7 +23,7 @@
-
+
@@ -31,7 +31,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -596,11 +602,11 @@
Chapter 2 Individual Geoms
+Learning objectives
-- Geoms are the fundamental building blocks of ggplot2.
-- Most of the geoms are associated with a named plot.
-- Some geoms can be added on to low-level geoms to create more complex plots.
-- To find out more about individual geoms see their documentation.
+- Discuss how geoms are the fundamental building blocks of ggplot2.
+- Draw comparisons between geoms and their associated named plot.
+- Explore each individual geom by reviewing its documentation.
@@ -609,7 +615,7 @@ Chapter 2 Individual Geoms
-
+
diff --git a/internals-of-ggplot2.html b/internals-of-ggplot2.html
index 073fbfc6..7a0b0d09 100644
--- a/internals-of-ggplot2.html
+++ b/internals-of-ggplot2.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -603,10 +609,10 @@
 Chapter 18 Internals of ggplot2
 - What is the division of labor between {ggplot2} and {grid}?
- What is the basic structure of/motivation for ggproto?
-library(ggplot2)
-library(ggtrace) # remotes::install_github("yjunechoe/ggtrace")
-library(purrr)
-library(dplyr)
+library(ggplot2)
+library(ggtrace) # remotes::install_github("yjunechoe/ggtrace")
+library(purrr)
+library(dplyr)
18.0.0.1 Introduction (the existence of internals)
 The user-facing code that defines a ggplot on the surface is not the same as the internal code that creates a ggplot under the hood. In this chapter, we’ll learn how the internal code operates and develop some intuition for thinking about the internals, starting with two simple examples of mismatches between surface and underlying form:
@@ -615,40 +621,40 @@ 18.0.0.1 Introduction (the existe
18.0.0.2 Case 1: Order
 You can change the order of some “layers” without changing the graphical output.
 For example, scale_*() can be added anywhere and always ends up applying to the whole plot:
-ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) +
- scale_x_log10() + #< scale first
- geom_point() +
- geom_smooth()
-
-ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) +
- geom_point() +
- scale_x_log10() + #< scale middle
- geom_smooth()
+ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) +
+ scale_x_log10() + #< scale first
+ geom_point() +
+ geom_smooth()
+
+ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) +
+ geom_point() +
+ scale_x_log10() + #< scale middle
+ geom_smooth()
 The order of geom_*() and stat_*() layers does matter, however, because it determines the order of drawing:
-
+
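The code for this step is not visible in the diff, so the following is only a minimal sketch of the point about drawing order. It assumes the same `mtcars` variables used above; the helper `layer_geoms()` is a hypothetical name introduced here for illustration.

```r
library(ggplot2)

# The same two layers, added in opposite orders.
points_first <- ggplot(mtcars, aes(mpg, hp)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

smooth_first <- ggplot(mtcars, aes(mpg, hp)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  geom_point()

# Hypothetical helper: report the geom class of each layer, in stored order.
layer_geoms <- function(p) {
  vapply(p$layers, function(l) class(l$geom)[1], character(1), USE.NAMES = FALSE)
}

layer_geoms(points_first)
layer_geoms(smooth_first)
```

Layers are stored in the order they were added and drawn in that order, so in `smooth_first` the points are painted on top of the smooth.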
18.0.0.3 Case 2: Modularity
 We know that the user-facing “layer” code that we add to a ggplot with + consists of stand-alone function calls:
-
+
geom_smooth: na.rm = FALSE, orientation = NA, se = TRUE
stat_smooth: na.rm = FALSE, orientation = NA, se = TRUE, method = lm, formula = y ~ x
position_identity
When we add this object to different ggplots, it materializes in different ways:
-
+
-
+
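As a hedged illustration of this modularity (the original code chunks are not shown in the diff), the sketch below creates one layer object and adds it to two different plots. The object name `lm_smooth` and the choice of `mtcars` columns are assumptions made for the example.

```r
library(ggplot2)

# A layer is an ordinary R object: create it once, reuse it anywhere.
lm_smooth <- geom_smooth(method = lm, formula = y ~ x)

# The same layer object fits a different line in each plot,
# because it inherits different data mappings from each ggplot() call.
p_mpg <- ggplot(mtcars, aes(mpg, hp)) + geom_point() + lm_smooth
p_wt  <- ggplot(mtcars, aes(wt, hp)) + geom_point() + lm_smooth
```

Printing `p_mpg` and `p_wt` shows the smooth fitted against `mpg` in one plot and against `wt` in the other, even though the layer code is identical.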
diff --git a/introducing-ggproto.html b/introducing-ggproto.html
index d065253d..3bf1854c 100644
--- a/introducing-ggproto.html
+++ b/introducing-ggproto.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,47 +603,47 @@
18.4 Introducing ggproto
It’s essentially a list of functions
-String <- list(
- add = function(x, y) paste0(x, y),
- subtract = function(x, y) gsub(y, "", x, fixed = TRUE),
- show = function(x, y) paste0(x, " and ", y)
-)
-Number <- list(
- add = function(x, y) x + y,
- subtract = function(x, y) x - y,
- show = String$show
-)
-
+String <- list(
+ add = function(x, y) paste0(x, y),
+ subtract = function(x, y) gsub(y, "", x, fixed = TRUE),
+ show = function(x, y) paste0(x, " and ", y)
+)
+Number <- list(
+ add = function(x, y) x + y,
+ subtract = function(x, y) x - y,
+ show = String$show
+)
+
[1] "ab"
-
+
[1] "jun"
-
+
[1] "ggplot and bookclub"
-
+
[1] 3
-
+
[1] 5
-
+
[1] "1 and 2"
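The calls that produced the outputs above are not visible in the diff. The following sketch repeats the two list definitions and shows one set of calls that would give the same results; the specific arguments are assumptions chosen to match the printed output.

```r
String <- list(
  add = function(x, y) paste0(x, y),
  subtract = function(x, y) gsub(y, "", x, fixed = TRUE),
  show = function(x, y) paste0(x, " and ", y)
)
Number <- list(
  add = function(x, y) x + y,
  subtract = function(x, y) x - y,
  show = String$show  # Number reuses the String method directly
)

# Methods are looked up by name with `$` and called like any function.
String$add("a", "b")               # "ab"
String$subtract("june", "e")       # "jun"
String$show("ggplot", "bookclub")  # "ggplot and bookclub"
Number$add(1, 2)                   # 3
Number$subtract(8, 3)              # 5
Number$show(1, 2)                  # "1 and 2"
```

Sharing `show` between the two lists is a by-hand version of the method reuse that ggproto provides through inheritance.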
18.4.1 ggproto syntax
From the book:
-Person <- ggproto("Person", NULL,
- first = "",
- last = "",
- birthdate = NA,
-
- full_name = function(self) {
- paste(self$first, self$last)
- },
- age = function(self) {
- days_old <- Sys.Date() - self$birthdate
- floor(as.integer(days_old) / 365.25)
- },
- description = function(self) {
- paste(self$full_name(), "is", self$age(), "old")
- }
-)
+Person <- ggproto("Person", NULL,
+ first = "",
+ last = "",
+ birthdate = NA,
+
+ full_name = function(self) {
+ paste(self$first, self$last)
+ },
+ age = function(self) {
+ days_old <- Sys.Date() - self$birthdate
+ floor(as.integer(days_old) / 365.25)
+ },
+ description = function(self) {
+ paste(self$full_name(), "is", self$age(), "old")
+ }
+)
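A minimal usage sketch for the `Person` object follows. It repeats the definition so that it runs on its own; the child object `Jane` and its field values are hypothetical and not taken from the source.

```r
library(ggplot2)

Person <- ggproto("Person", NULL,
  first = "",
  last = "",
  birthdate = NA,

  full_name = function(self) {
    paste(self$first, self$last)
  },
  age = function(self) {
    days_old <- Sys.Date() - self$birthdate
    floor(as.integer(days_old) / 365.25)
  },
  description = function(self) {
    paste(self$full_name(), "is", self$age(), "old")
  }
)

# Hypothetical child object: inherits every method from Person and
# overrides only the data fields.
Jane <- ggproto("Jane", Person,
  first = "Jane",
  last = "Doe",
  birthdate = as.Date("1990-01-01")
)

Jane$full_name()
Jane$description()
```

Methods that take `self` as their first argument see the fields of the object they are called on, which is why `Jane$full_name()` uses Jane's names even though the method is defined on `Person`.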
18.4.2 ggproto style guide
diff --git a/introduction-1.html b/introduction-1.html
index f45e2649..5de940bb 100644
--- a/introduction-1.html
+++ b/introduction-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/introduction-2.html b/introduction-2.html
index b0878dda..e9d81e7f 100644
--- a/introduction-2.html
+++ b/introduction-2.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/introduction-3.html b/introduction-3.html
index 4d0d8a63..a4fb5dad 100644
--- a/introduction-3.html
+++ b/introduction-3.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -596,7 +602,7 @@
7.1 Introduction
-
+
]
Packages
diff --git a/introduction-4.html b/introduction-4.html
index 0f51ac3b..193f30ea 100644
--- a/introduction-4.html
+++ b/introduction-4.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/introduction-5.html b/introduction-5.html
index 15895820..3d6b0d85 100644
--- a/introduction-5.html
+++ b/introduction-5.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -605,9 +611,9 @@
 14.1 Introduction
 library(tidyverse)
-library(patchwork)
-iris %>% head()
+
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
diff --git a/introduction-6.html b/introduction-6.html
index 01afc5c3..de5ed45c 100644
--- a/introduction-6.html
+++ b/introduction-6.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/introduction-preliminaries-asides.html b/introduction-preliminaries-asides.html
index 909245f5..98f8f3c8 100644
--- a/introduction-preliminaries-asides.html
+++ b/introduction-preliminaries-asides.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/introduction.html b/introduction.html
index a3064b78..2663d423 100644
--- a/introduction.html
+++ b/introduction.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/learning-objectives-1.html b/learning-objectives-1.html
index a5899cd3..7584dac7 100644
--- a/learning-objectives-1.html
+++ b/learning-objectives-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/learning-objectives.html b/learning-objectives.html
index e381300c..775b98ca 100644
--- a/learning-objectives.html
+++ b/learning-objectives.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/legend-key-glyphs.html b/legend-key-glyphs.html
index 9fe2a563..7024124d 100644
--- a/legend-key-glyphs.html
+++ b/legend-key-glyphs.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/legend-merging-and-splitting.html b/legend-merging-and-splitting.html
index 7fcd82bf..c53373a4 100644
--- a/legend-merging-and-splitting.html
+++ b/legend-merging-and-splitting.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/legend-position.html b/legend-position.html
index 081015c8..75b550da 100644
--- a/legend-position.html
+++ b/legend-position.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/what-low-level-geoms-are-used-to-draw-geom_smooth.html b/line-chart-geom_line.html
similarity index 90%
rename from what-low-level-geoms-are-used-to-draw-geom_smooth.html
rename to line-chart-geom_line.html
index 6d1d1caf..44d769a4 100644
--- a/what-low-level-geoms-are-used-to-draw-geom_smooth.html
+++ b/line-chart-geom_line.html
@@ -4,18 +4,18 @@
-
2.7 What low-level geoms are used to draw geom_smooth()? | ggplot2 Book Club
+ 2.4 Line chart: geom_line() | ggplot2 Book Club
-
+
-
+
@@ -23,15 +23,15 @@
-
+
-
-
+
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,22 +600,40 @@
-
-2.7 What low-level geoms are used to draw geom_smooth()?
-Geom_smooth() fits a smoother to data, displaying the smooth and its standard error, allowing you to see a dominant pattern within a scatterplot with a lot of “noise”. The low level geom for geom_smooth() are geom_path(), geom_area() and geom_point().
-
-## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
-
+
+2.4 Line chart: geom_line()
+
+- A geom that connects points from left to right.
+
+linetype
is a useful parameter.
+- Check out the different linetypes here.
+- Also here
?linetype
+
+
+
+
+
+- What’s up with geom_path()?
+
+- Connects points as they appear in order of the data
+- Answer to exercise 2.
+
+
+
+
+
+
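The difference between the two geoms can be sketched with a small, deliberately unsorted data frame. The data and object names here are hypothetical and are not the example used in the chapter.

```r
library(ggplot2)

# Hypothetical data whose rows are deliberately not sorted by x.
df <- data.frame(x = c(3, 1, 2), y = c(1, 3, 2))

p_line <- ggplot(df, aes(x, y)) + geom_line()  # connects points in order of x
p_path <- ggplot(df, aes(x, y)) + geom_path()  # connects points in row order

# The built layer data shows the order in which points are connected.
layer_data(p_line)$x
layer_data(p_path)$x
```

With time series data that is already sorted by date the two geoms give the same picture; the difference only shows up when the row order and the x order disagree.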
-
-
+
+
diff --git a/line-plot.html b/line-plot.html
deleted file mode 100644
index b39b2b48..00000000
--- a/line-plot.html
+++ /dev/null
@@ -1,665 +0,0 @@
-
-
-
-
-
-
- 2.2 Line plot: | ggplot2 Book Club
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
diff --git a/line-type.html b/line-type.html
index decd6905..6fd95892 100644
--- a/line-type.html
+++ b/line-type.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/linear-coordinate-systems.html b/linear-coordinate-systems.html
index 25e845b1..cdab9d82 100644
--- a/linear-coordinate-systems.html
+++ b/linear-coordinate-systems.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -605,39 +611,39 @@
14.2 Linear coordinate systems
coord_cartesian()
-p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
- geom_point(aes(fill=Species),
- show.legend = F,
- shape=21,color="grey20",alpha=0.5) +
- geom_smooth(color="pink") +
- theme_light()
-
-p1 | p1 + scale_x_continuous(limits = c(5, 6)) | p1 + coord_cartesian(xlim = c(5, 6))
+p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
+ geom_point(aes(fill=Species),
+ show.legend = F,
+ shape=21,color="grey20",alpha=0.5) +
+ geom_smooth(color="pink") +
+ theme_light()
+
+p1 | p1 + scale_x_continuous(limits = c(5, 6)) | p1 + coord_cartesian(xlim = c(5, 6))
coord_flip()
-p2 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
- geom_point(aes(fill=Species),
- show.legend = F,
- shape=21,color="grey20",alpha=0.5) +
- geom_smooth(color="pink") +
- theme_light()
-
-p3 <- ggplot(iris, aes(Sepal.Width,Sepal.Length)) +
- geom_point(aes(fill=Species),
- show.legend = F,
- shape=21,color="grey20",alpha=0.5) +
- geom_smooth(color="pink") +
- theme_light()
-
-p2 | p2 + coord_flip() | p3
+p2 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
+ geom_point(aes(fill=Species),
+ show.legend = F,
+ shape=21,color="grey20",alpha=0.5) +
+ geom_smooth(color="pink") +
+ theme_light()
+
+p3 <- ggplot(iris, aes(Sepal.Width,Sepal.Length)) +
+ geom_point(aes(fill=Species),
+ show.legend = F,
+ shape=21,color="grey20",alpha=0.5) +
+ geom_smooth(color="pink") +
+ theme_light()
+
+p2 | p2 + coord_flip() | p3
(the smooth is fit to the rotated data).
coord_fixed()
-
+
diff --git a/main-data-set.html b/main-data-set.html
index 981e9523..48bdbdb3 100644
--- a/main-data-set.html
+++ b/main-data-set.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/manual-scales-1.html b/manual-scales-1.html
index 83e79d28..4e249bcc 100644
--- a/manual-scales-1.html
+++ b/manual-scales-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/map-projections.html b/map-projections.html
index bbfa7978..3e1de9a6 100644
--- a/map-projections.html
+++ b/map-projections.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/mapping-components.html b/mapping-components.html
index 10f8323b..06a473a5 100644
--- a/mapping-components.html
+++ b/mapping-components.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/maps.html b/maps.html
index 9819d9fe..db9601e6 100644
--- a/maps.html
+++ b/maps.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/mastering-the-grammar.html b/mastering-the-grammar.html
index 5578ce93..fd750955 100644
--- a/mastering-the-grammar.html
+++ b/mastering-the-grammar.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-1.html b/meeting-videos-1.html
index 3dc7123e..81e7f1ad 100644
--- a/meeting-videos-1.html
+++ b/meeting-videos-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-10.html b/meeting-videos-10.html
index d0fe7c93..9f7ba278 100644
--- a/meeting-videos-10.html
+++ b/meeting-videos-10.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-11.html b/meeting-videos-11.html
index 6698b018..2ff2391a 100644
--- a/meeting-videos-11.html
+++ b/meeting-videos-11.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-12.html b/meeting-videos-12.html
index a2790a22..4ae5c207 100644
--- a/meeting-videos-12.html
+++ b/meeting-videos-12.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-13.html b/meeting-videos-13.html
index 369dc94d..fd6a8c74 100644
--- a/meeting-videos-13.html
+++ b/meeting-videos-13.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-14.html b/meeting-videos-14.html
index c5369941..6a11f2c2 100644
--- a/meeting-videos-14.html
+++ b/meeting-videos-14.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-15.html b/meeting-videos-15.html
index d4b61532..b4b96190 100644
--- a/meeting-videos-15.html
+++ b/meeting-videos-15.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-16.html b/meeting-videos-16.html
index 429ab2ff..e0d69eec 100644
--- a/meeting-videos-16.html
+++ b/meeting-videos-16.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-17.html b/meeting-videos-17.html
index 94ce1c88..3c556b39 100644
--- a/meeting-videos-17.html
+++ b/meeting-videos-17.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-18.html b/meeting-videos-18.html
index 3aeb21c9..09039d22 100644
--- a/meeting-videos-18.html
+++ b/meeting-videos-18.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-19.html b/meeting-videos-19.html
index c5181763..505a0a06 100644
--- a/meeting-videos-19.html
+++ b/meeting-videos-19.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-2.html b/meeting-videos-2.html
index 5249959c..84382b2e 100644
--- a/meeting-videos-2.html
+++ b/meeting-videos-2.html
@@ -4,18 +4,18 @@
-
2.10 Meeting Videos | ggplot2 Book Club
+ 2.11 Meeting Videos | ggplot2 Book Club
-
+
-
+
@@ -23,14 +23,14 @@
-
+
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,10 +600,10 @@
-
-2.10 Meeting Videos
-
-2.10.1 Cohort 1
+
diff --git a/meeting-videos-20.html b/meeting-videos-20.html
index a9daf1e3..2c596383 100644
--- a/meeting-videos-20.html
+++ b/meeting-videos-20.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-21.html b/meeting-videos-21.html
index 05faa19c..5ec786fc 100644
--- a/meeting-videos-21.html
+++ b/meeting-videos-21.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-3.html b/meeting-videos-3.html
index 00b1a6f3..e9b96eeb 100644
--- a/meeting-videos-3.html
+++ b/meeting-videos-3.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-4.html b/meeting-videos-4.html
index ebff9c7e..fef32b41 100644
--- a/meeting-videos-4.html
+++ b/meeting-videos-4.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-5.html b/meeting-videos-5.html
index b960047e..1b44f85b 100644
--- a/meeting-videos-5.html
+++ b/meeting-videos-5.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-6.html b/meeting-videos-6.html
index a5c4dcb2..100241e7 100644
--- a/meeting-videos-6.html
+++ b/meeting-videos-6.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-7.html b/meeting-videos-7.html
index 194f8beb..ae310e59 100644
--- a/meeting-videos-7.html
+++ b/meeting-videos-7.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-8.html b/meeting-videos-8.html
index 4f0e3ff0..ce790459 100644
--- a/meeting-videos-8.html
+++ b/meeting-videos-8.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos-9.html b/meeting-videos-9.html
index 518f4c98..341fa372 100644
--- a/meeting-videos-9.html
+++ b/meeting-videos-9.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/meeting-videos.html b/meeting-videos.html
index bcbd26b8..67e31151 100644
--- a/meeting-videos.html
+++ b/meeting-videos.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/modifying-the-axes.html b/modifying-the-axes.html
index 7ff540f0..6d171cfa 100644
--- a/modifying-the-axes.html
+++ b/modifying-the-axes.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -623,7 +629,7 @@
1.9 Modifying the Axes geom_jitter(width = 0.25) +
xlim("f", "r") +
ylim(20, 30)
-## Warning: Removed 138 rows containing missing values or values outside the scale range
+## Warning: Removed 139 rows containing missing values or values outside the scale range
## (`geom_point()`).
diff --git a/networks.html b/networks.html
index 398edefe..050cbf82 100644
--- a/networks.html
+++ b/networks.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/non-linear-coordinate-systems.html b/non-linear-coordinate-systems.html
index 00c00790..a17e969c 100644
--- a/non-linear-coordinate-systems.html
+++ b/non-linear-coordinate-systems.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -605,123 +611,123 @@
14.3 Non-linear coordinate system
coord_polar()
-
+
14.3.1 Example: coord_polar() with DuBoisChallenge N°8 data
source: DuBois data portraits
-df <- read_csv("https://raw.githubusercontent.com/ajstarks/dubois-data-portraits/master/challenge/2022/challenge08/data.csv")
-
-df2 <- df %>%
- arrange(-Year)
-
-df2[7,1] <- 1875
-df2[7,2] <- 0
-df2[7,3] <- 0
-df2 %>%
- ggplot() +
-
- geom_line(data= subset(df2, Year %in% c(1875,1875)),
- mapping = aes(x=Year, y= `Houshold Value (Dollars)`),
- color="#FFCDCB",size=6) +
-
- geom_line(data= subset(df2, Year%in%c(1875,1875,1880)),
- mapping= aes(x=Year +2, y= `Houshold Value (Dollars)`),
- color="#989EB4",size=6) +
-
- geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885)),
- mapping= aes(x=Year +4, y= `Houshold Value (Dollars)`),
- color="#b08c71",size=6) +
-
- geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890)),
- mapping= aes(x=Year +6, y= `Houshold Value (Dollars)`),
- color="#FFC942",size=6) +
-
- geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890,1895)),
- mapping= aes(x=Year +8, y= `Houshold Value (Dollars)`),
- color="#EFDECC", size=6) +
-
- geom_line(mapping= aes(x=Year +10, y= `Houshold Value (Dollars)`),
- color="#F02C49",size=6) +
-
- coord_polar(theta = "y",
- start = 0,
- direction = 1,
- clip = "off") +
-
- # other scales that can be used:
- #scale_x_reverse(expand=expansion(mult=c(-0.9,-0.1),add=c(29,-0.1))) +
- #scale_y_continuous(expand=expansion(mult=c(0.09,0.01),add=c(0,-790000))) +
-
- scale_x_reverse(expand=expansion(add=c(11,-5))) +
- scale_y_continuous(expand=expansion(add=c(0,-600000))) +
- labs(title="ASSESSED VALUE OF HOUSEHOLD AND KITCHEN FURNITURE
- OWNED BY GEORGIA NEGROES.")+
- theme_void() +
- theme(text = element_text(face="bold",
- color="grey27"),
- aspect.ratio =2/1.9, #y/x
- plot.background = element_rect(color= "#d9ccbf", fill= "#d9ccbf"),
- plot.title = element_text(hjust=0.5,size=9))
+df <- read_csv("https://raw.githubusercontent.com/ajstarks/dubois-data-portraits/master/challenge/2022/challenge08/data.csv")
+
+df2 <- df %>%
+ arrange(-Year)
+
+df2[7,1] <- 1875
+df2[7,2] <- 0
+df2[7,3] <- 0
+df2 %>%
+ ggplot() +
+
+ geom_line(data= subset(df2, Year %in% c(1875,1875)),
+ mapping = aes(x=Year, y= `Houshold Value (Dollars)`),
+ color="#FFCDCB",size=6) +
+
+ geom_line(data= subset(df2, Year%in%c(1875,1875,1880)),
+ mapping= aes(x=Year +2, y= `Houshold Value (Dollars)`),
+ color="#989EB4",size=6) +
+
+ geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885)),
+ mapping= aes(x=Year +4, y= `Houshold Value (Dollars)`),
+ color="#b08c71",size=6) +
+
+ geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890)),
+ mapping= aes(x=Year +6, y= `Houshold Value (Dollars)`),
+ color="#FFC942",size=6) +
+
+ geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890,1895)),
+ mapping= aes(x=Year +8, y= `Houshold Value (Dollars)`),
+ color="#EFDECC", size=6) +
+
+ geom_line(mapping= aes(x=Year +10, y= `Houshold Value (Dollars)`),
+ color="#F02C49",size=6) +
+
+ coord_polar(theta = "y",
+ start = 0,
+ direction = 1,
+ clip = "off") +
+
+ # other scales that can be used:
+ #scale_x_reverse(expand=expansion(mult=c(-0.9,-0.1),add=c(29,-0.1))) +
+ #scale_y_continuous(expand=expansion(mult=c(0.09,0.01),add=c(0,-790000))) +
+
+ scale_x_reverse(expand=expansion(add=c(11,-5))) +
+ scale_y_continuous(expand=expansion(add=c(0,-600000))) +
+ labs(title="ASSESSED VALUE OF HOUSEHOLD AND KITCHEN FURNITURE
+ OWNED BY GEORGIA NEGROES.")+
+ theme_void() +
+ theme(text = element_text(face="bold",
+ color="grey27"),
+ aspect.ratio =2/1.9, #y/x
+ plot.background = element_rect(color= "#d9ccbf", fill= "#d9ccbf"),
+ plot.title = element_text(hjust=0.5,size=9))
coord_trans()
-rect <- data.frame(x = 50, y = 50)
-line <- data.frame(x = c(1, 200), y = c(100, 1))
-p6 <- ggplot(mapping = aes(x, y)) +
- geom_tile(data = rect, aes(width = 50, height = 50)) +
- geom_line(data = line) +
- xlab(NULL) + ylab(NULL)
-
-p6
+rect <- data.frame(x = 50, y = 50)
+line <- data.frame(x = c(1, 200), y = c(100, 1))
+p6 <- ggplot(mapping = aes(x, y)) +
+ geom_tile(data = rect, aes(width = 50, height = 50)) +
+ geom_line(data = line) +
+ xlab(NULL) + ylab(NULL)
+
+p6
-
+
-p7 <- ggplot(iris, aes(Sepal.Length, Petal.Length)) +
- stat_bin2d() +
- geom_smooth(method = "lm") +
- xlab(NULL) +
- ylab(NULL) +
- theme(legend.position = "none")
-p7
+p7 <- ggplot(iris, aes(Sepal.Length, Petal.Length)) +
+ stat_bin2d() +
+ geom_smooth(method = "lm") +
+ xlab(NULL) +
+ ylab(NULL) +
+ theme(legend.position = "none")
+p7
-#> `geom_smooth()` using formula 'y ~ x'
-
-# Better fit on log scale, but harder to interpret
-p7 +
- scale_x_log10() +
- scale_y_log10()
+#> `geom_smooth()` using formula 'y ~ x'
+
+# Better fit on log scale, but harder to interpret
+p7 +
+ scale_x_log10() +
+ scale_y_log10()
-#> `geom_smooth()` using formula 'y ~ x'
-
-# Fit on log scale, then backtransform to original.
-# Highlights lack of expensive diamonds with large carats
-pow10 <- scales::exp_trans(10)
-p7 +
- scale_x_log10() +
- scale_y_log10() +
- coord_trans(x = pow10, y = pow10)
+#> `geom_smooth()` using formula 'y ~ x'
+
+# Fit on log scale, then backtransform to original.
+# Highlights lack of expensive diamonds with large carats
+pow10 <- scales::exp_trans(10)
+p7 +
+ scale_x_log10() +
+ scale_y_log10() +
+ coord_trans(x = pow10, y = pow10)
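The comment "fit on log scale, then backtransform to original" can also be illustrated outside ggplot2. The sketch below is only an approximation of the idea in base R, not what `coord_trans()` does internally: it fits a straight line to the log10-transformed `iris` columns and raises 10 to the predictions to return to the original units. The names `fit_log`, `fit_raw` and `grid` are introduced here for illustration.

```r
# Fit a straight line on the log10 scale (what the scale_*_log10() calls set up).
fit_log <- lm(log10(Petal.Length) ~ log10(Sepal.Length), data = iris)

# A small grid of sepal lengths to predict on.
grid <- data.frame(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length),
                     length.out = 5)
)

# Predictions are on the log10 scale; 10^ brings them back to centimetres.
grid$pred_log <- predict(fit_log, newdata = grid)
grid$pred_original <- 10^grid$pred_log

# For comparison: a straight line fitted directly on the original scale.
fit_raw <- lm(Petal.Length ~ Sepal.Length, data = iris)
grid$pred_raw <- predict(fit_raw, newdata = grid)

grid
```

The back-transformed predictions trace a curve on the original scale, whereas `pred_raw` stays on a straight line. That is the visual difference between the last plot and the first `p7`.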
coord_map()
/coord_quickmap()
/coord_sf()
-world <- map_data("world")
-worldmap <- ggplot(world, aes(long, lat, group = group)) +
- geom_path() +
- scale_y_continuous(NULL, breaks = (-2:3) * 30, labels = NULL) +
- scale_x_continuous(NULL, breaks = (-4:4) * 45, labels = NULL)
-
-
-worldmap + coord_quickmap() |
-worldmap + coord_map("ortho") |
-worldmap + coord_map("stereographic")
+world <- map_data("world")
+worldmap <- ggplot(world, aes(long, lat, group = group)) +
+ geom_path() +
+ scale_y_continuous(NULL, breaks = (-2:3) * 30, labels = NULL) +
+ scale_x_continuous(NULL, breaks = (-4:4) * 45, labels = NULL)
+
+
+worldmap + coord_quickmap() |
+worldmap + coord_map("ortho") |
+worldmap + coord_map("stereographic")
diff --git a/numeric.html b/numeric.html
index 471f632d..df247e0a 100644
--- a/numeric.html
+++ b/numeric.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -782,7 +788,7 @@
9.2.10 ASIDE - A little more on t
## d_transform = function(x) rep(-1, length(x)), d_inverse = function(x) rep(-1,
## length(x)), minor_breaks = regular_minor_breaks(reverse = TRUE))
## }
-## <bytecode: 0x5567abae6290>
+## <bytecode: 0x55d92b830ed0>
## <environment: namespace:scales>
## List of 9
## $ name : chr "reverse"
diff --git a/other-aesthetic-attributes.html b/other-aesthetic-attributes.html
index 0081cec1..d3c2a9c3 100644
--- a/other-aesthetic-attributes.html
+++ b/other-aesthetic-attributes.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/other-aesthetics.html b/other-aesthetics.html
index 978736b3..4c10c7e9 100644
--- a/other-aesthetics.html
+++ b/other-aesthetics.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/other-important-parts.html b/other-important-parts.html
index 1a116168..41d46f24 100644
--- a/other-important-parts.html
+++ b/other-important-parts.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/output.html b/output.html
index c48638d4..e05ce12a 100644
--- a/output.html
+++ b/output.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/overview.html b/overview.html
index 599b69e6..56b66c06 100644
--- a/overview.html
+++ b/overview.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/plot-and-axis-titles.html b/plot-and-axis-titles.html
index bd5aa8a1..7510db93 100644
--- a/plot-and-axis-titles.html
+++ b/plot-and-axis-titles.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -596,22 +602,22 @@
7.2 Plot and Axis Titles
-base <- ggplot(penguins, aes(bill_length_mm, bill_depth_mm,
- color = species, shape = species)) +
- geom_point(alpha = .4) +
- geom_point(data = gd, size = 4) +
-theme_bw() +
-labs(
-title = "How does Bill Size Differ by species?",
-subtitle = "Source: Palmer Station Antarctica LTER and K. Gorman, 2020",
-x = "*Length*",
-y = "Width",
-caption = "ggplot 2 Book Club") +
-theme(plot.title = element_text(color = "midnightblue",
- hjust = .5, face = "bold")) +
-theme(plot.subtitle = element_text(hjust = .5, size = 9)) +
-theme(axis.title.x = ggtext::element_markdown())
-
+base <- ggplot(penguins, aes(bill_length_mm, bill_depth_mm,
+ color = species, shape = species)) +
+ geom_point(alpha = .4) +
+ geom_point(data = gd, size = 4) +
+theme_bw() +
+labs(
+title = "How does Bill Size Differ by species?",
+subtitle = "Source: Palmer Station Antarctica LTER and K. Gorman, 2020",
+x = "*Length*",
+y = "Width",
+caption = "ggplot 2 Book Club") +
+theme(plot.title = element_text(color = "midnightblue",
+ hjust = .5, face = "bold")) +
+theme(plot.subtitle = element_text(hjust = .5, size = 9)) +
+theme(axis.title.x = ggtext::element_markdown())
+
- line breaks
- quote() for mathematical expressions; see ?plotmath
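A small sketch of the two points in the list above, assuming ggplot2 and its built-in `mpg` data (this section's own example uses `penguins`). A `"\n"` inside a label string forces a line break, and an expression captured with `quote()` is drawn using plotmath. The label texts and variable names are made up for illustration.

```r
library(ggplot2)

# "\n" inside a string starts a new line in the rendered title.
two_line_title <- "Engine size and fuel economy\n(mpg data, illustrative only)"

# quote() captures an unevaluated expression; ggplot2 draws it with plotmath,
# so this renders as a fraction with a subscripted numerator.
y_label <- quote(frac(miles[highway], gallon))

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  labs(title = two_line_title, x = "Displacement (l)", y = y_label)

p
```

`?plotmath` lists the expression syntax that can be used inside `quote()`.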
diff --git a/plot-elements-of-a-theme.html b/plot-elements-of-a-theme.html
index 09796c5a..bcd46a65 100644
--- a/plot-elements-of-a-theme.html
+++ b/plot-elements-of-a-theme.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/polygon-maps.html b/polygon-maps.html
index 996d9892..d75a3642 100644
--- a/polygon-maps.html
+++ b/polygon-maps.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,10 +603,10 @@
5.1 Polygon Maps
The simplest approach to mapping is using geom_polygon(). This forms boundaries around regions.
-library(ggplot2)
-mi_counties <- map_data("county", "michigan") %>%
- select(lon = long, lat, group, id = subregion)
-head(mi_counties)
+library(ggplot2)
+mi_counties <- map_data("county", "michigan") %>%
+ select(lon = long, lat, group, id = subregion)
+head(mi_counties)
## lon lat group id
## 1 -83.88675 44.85686 1 alcona
## 2 -83.36536 44.86832 1 alcona
@@ -613,13 +619,13 @@ 5.1 Polygon Maps
-ggplot(mi_counties, aes(lon, lat)) +
- geom_point(size = .25, show.legend = FALSE) +
- coord_quickmap()
+
-ggplot(mi_counties, aes(lon, lat, group = group)) +
- geom_polygon(fill = "white", colour = "grey50") +
- coord_quickmap()
+ggplot(mi_counties, aes(lon, lat, group = group)) +
+ geom_polygon(fill = "white", colour = "grey50") +
+ coord_quickmap()
In this plot, coord_quickmap()
is used to adjust the axes to ensure longitude and latitude are rendered on the same scale.
For a more advanced use of ggplot2 for mapping, we’ll see the use of geom_sf()
and coord_sf()
to handle spatial data specified in simple features format.
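To see why an adjustment is needed at all: away from the equator, one degree of longitude covers less ground than one degree of latitude, by a factor of roughly cos(latitude). The sketch below approximates the idea with `coord_fixed()`. It is an assumption-laden illustration, not the exact computation `coord_quickmap()` performs, and the `square` data frame is hypothetical.

```r
library(ggplot2)

# Hypothetical one-degree "square" at roughly Michigan's latitude.
square <- data.frame(
  lon = c(-85, -84, -84, -85),
  lat = c(44, 44, 45, 45)
)

# Stretch the y axis so a degree of latitude and a degree of longitude
# cover about the same distance on screen at the middle latitude.
mid_lat <- mean(range(square$lat))
ratio <- 1 / cos(mid_lat * pi / 180)

base <- ggplot(square, aes(lon, lat)) +
  geom_polygon(fill = "white", colour = "grey50")

p_fixed <- base + coord_fixed(ratio = ratio)  # manual approximation
p_quick <- base + coord_quickmap()            # ggplot2 picks the ratio itself
```

Printing `p_fixed` and `p_quick` side by side should give very similar shapes. With plain Cartesian coordinates the same polygon would fill whatever shape the plotting window has.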
diff --git a/what-low-level-geoms-are-used-to-draw-geom_violin.html b/polygons-geom_polygon.html
similarity index 92%
rename from what-low-level-geoms-are-used-to-draw-geom_violin.html
rename to polygons-geom_polygon.html
index 9fa85e64..8ee34c9a 100644
--- a/what-low-level-geoms-are-used-to-draw-geom_violin.html
+++ b/polygons-geom_polygon.html
@@ -4,18 +4,18 @@
- 2.9 What low-level geoms are used to draw geom_violin()? | ggplot2 Book Club
+ 2.6 Polygons: geom_polygon() | ggplot2 Book Club
-
+
-
+
@@ -23,15 +23,15 @@
-
+
-
-
+
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,19 +600,23 @@
-
-2.9 What low-level geoms are used to draw geom_violin()?
-Violin plots show a compact representation of the density of the distribution highlighting the areas where most of the points are found. The low level geom for geom_violin() are geom_area() and geom_path().
-
-
+
+2.6 Polygons: geom_polygon()
+
+- Draws polygons, which are filled paths.
+- Useful when making maps: more in Chapter 6.
+
+
+
-
-
+
+
diff --git a/position-adjustments.html b/position-adjustments.html
index 7b4de8f3..be23daea 100644
--- a/position-adjustments.html
+++ b/position-adjustments.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/position-scales-and-axes.html b/position-scales-and-axes.html
index 0bee5715..1b8d684c 100644
--- a/position-scales-and-axes.html
+++ b/position-scales-and-axes.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/prerequisites.html b/prerequisites.html
index a6e6026a..336b458e 100644
--- a/prerequisites.html
+++ b/prerequisites.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/present-a-chapter.html b/present-a-chapter.html
index 5ab96ea8..c8ee8ceb 100644
--- a/present-a-chapter.html
+++ b/present-a-chapter.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/process-and-examples.html b/process-and-examples.html
index fb2c7911..e1801c40 100644
--- a/process-and-examples.html
+++ b/process-and-examples.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/programming-single-and-multiple-components.html b/programming-single-and-multiple-components.html
index 8b80cf7e..0593fe77 100644
--- a/programming-single-and-multiple-components.html
+++ b/programming-single-and-multiple-components.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -601,39 +607,39 @@
17.1 Programming single and multi
17.1.1 Components
One example of a component of a plot is this one below:
-bestfit <- geom_smooth(
- method = "lm",
- se = FALSE,
- colour = alpha("steelblue", 0.5),
- size = 2)
+bestfit <- geom_smooth(
+ method = "lm",
+ se = FALSE,
+ colour = alpha("steelblue", 0.5),
+ size = 2)
This single component can be placed inside the syntax of the grammar of graphics and used as a plot layer.
-
-
+
+
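As a minimal sketch of that reuse, assuming ggplot2 and its built-in `mpg` data, the same `bestfit` object can be added to more than one plot. Note one deliberate change from the definition above: this sketch uses `linewidth` instead of `size`, on the assumption that ggplot2 3.4.0 or later is installed, where `size` is deprecated for lines.

```r
library(ggplot2)

# The component is an ordinary R object holding a layer.
bestfit <- geom_smooth(
  method = "lm",
  se = FALSE,
  colour = alpha("steelblue", 0.5),
  linewidth = 2  # assumption: ggplot2 >= 3.4.0; older versions use size = 2
)

# The same object can be reused across plots with different mappings.
p1 <- ggplot(mpg, aes(cty, hwy)) +
  geom_point() +
  bestfit

p2 <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  bestfit
```

Because `bestfit` stores no data or aesthetics of its own, it inherits both from whichever plot it is added to.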
Another way is to build a layer by wrapping it in a function:
-geom_lm <- function(formula = y ~ x, colour = alpha("steelblue", 0.5),
- size = 2, ...) {
- geom_smooth(formula = formula, se = FALSE, method = "lm", colour = colour,
- size = size, ...)
-}
+geom_lm <- function(formula = y ~ x, colour = alpha("steelblue", 0.5),
+ size = 2, ...) {
+ geom_smooth(formula = formula, se = FALSE, method = "lm", colour = colour,
+ size = size, ...)
+}
And then apply the function layer to the plot:
-ggplot(mpg, aes(displ, 1 / hwy)) +
- geom_point() +
- geom_lm(y ~ poly(x, 2), size = 1, colour = "red")
-
+ggplot(mpg, aes(displ, 1 / hwy)) +
+ geom_point() +
+ geom_lm(y ~ poly(x, 2), size = 1, colour = "red")
+
The book draws attention to the “open” parameter ….
A suggestion is to use it inside the function instead of in the function parameters definition.
Instead of only one component, we can build a plot made of more components.
-geom_mean <- function() {
- list(
- stat_summary(fun = "mean", geom = "bar", fill = "grey70"),
- stat_summary(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4)
- )
-}
+geom_mean <- function() {
+ list(
+ stat_summary(fun = "mean", geom = "bar", fill = "grey70"),
+ stat_summary(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4)
+ )
+}
With this result:
-
-
+
+
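A usage sketch for a multi-component function, assuming ggplot2 and its built-in `mpg` data. To avoid the extra package that `"mean_cl_normal"` relies on, this variant swaps in ggplot2's own `mean_se()`, so the error bars show the mean plus or minus one standard error rather than a confidence interval. The name `geom_mean_se` is hypothetical.

```r
library(ggplot2)

# Returns a list of two layers; adding the list to a plot adds both.
geom_mean_se <- function() {
  list(
    stat_summary(fun = "mean", geom = "bar", fill = "grey70"),
    # Swapped in for "mean_cl_normal": mean_se() ships with ggplot2.
    stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.4)
  )
}

p <- ggplot(mpg, aes(class, cty)) +
  geom_mean_se()

p
```

Adding a list with `+` adds each element in turn, which is what lets one function call contribute several layers.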
diff --git a/programming-with-ggplot2.html b/programming-with-ggplot2.html
index 3f6407f1..b68dbab2 100644
--- a/programming-with-ggplot2.html
+++ b/programming-with-ggplot2.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/quick-intuition-on-collective-geoms.html b/quick-intuition-on-collective-geoms.html
index 02151c13..09801930 100644
--- a/quick-intuition-on-collective-geoms.html
+++ b/quick-intuition-on-collective-geoms.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -613,41 +619,41 @@
3.3 Quick Intuition on Collective
This blog post by Simon Jackson illustrates these foundations using mtcars
. The points are individual geoms and the bars are a collective geom showing the average of the individual observations.
-id <- mtcars %>%
- tibble::rownames_to_column() %>%
- as_tibble() %>%
- mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")))
-
-gd <- id %>%
- group_by(am) %>%
- summarise(hp = mean(hp))
-
-ggplot(id, aes(x = am, y = hp, color = am, fill = am)) +
- geom_bar(data = gd, stat = "identity", alpha = 0.3) +
- ggrepel::geom_text_repel(aes(label = rowname), color = "black", size = 2.5, segment.color = "grey") +
- geom_point() +
- guides(color = "none", fill = "none") +
- theme_bw() +
- labs(
- title = "Car horespower by transmission type",
- x = "Transmission",
- y = "Horsepower"
- )
+id <- mtcars %>%
+ tibble::rownames_to_column() %>%
+ as_tibble() %>%
+ mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")))
+
+gd <- id %>%
+ group_by(am) %>%
+ summarise(hp = mean(hp))
+
+ggplot(id, aes(x = am, y = hp, color = am, fill = am)) +
+ geom_bar(data = gd, stat = "identity", alpha = 0.3) +
+ ggrepel::geom_text_repel(aes(label = rowname), color = "black", size = 2.5, segment.color = "grey") +
+ geom_point() +
+ guides(color = "none", fill = "none") +
+ theme_bw() +
+ labs(
+    title = "Car horsepower by transmission type",
+ x = "Transmission",
+ y = "Horsepower"
+ )
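The same contrast between individual and collective geoms can also be sketched without a pre-summarised data frame, by letting ggplot2 compute the group means itself. This is an alternative to the blog post's approach, not the author's code. It assumes ggplot2 3.3.0 or later (for the `fun` argument of `stat_summary()`), and the object names `cars` and `p` are illustrative.

```r
library(ggplot2)

# Individual observations: one row per car, transmission as a labelled factor
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

p <- ggplot(cars, aes(x = am, y = hp, colour = am, fill = am)) +
  # Collective geom: one bar per group, its height is mean(hp) computed by the stat
  stat_summary(fun = mean, geom = "bar", alpha = 0.3) +
  # Individual geoms: one point per car
  geom_point() +
  guides(colour = "none", fill = "none") +
  theme_bw() +
  labs(x = "Transmission", y = "Horsepower")

p
```

Precomputing the summary, as the blog post does, keeps the aggregated values available for other uses. Letting the stat do it keeps the plot code shorter when the summary is only needed for drawing.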
Next, a longitudinal example from the same blog post, chosen because the book's example is also longitudinal. It uses the financing_healthcare dataset from the ourworldindata
 package, which records healthcare spending per country over time.
-#library(devtools)
-#install_github("drsimonj/ourworldindata")
-
-library(ourworldindata)
-
-id <- financing_healthcare %>%
- filter(continent %in% c("Oceania", "Europe") & between(year, 2001, 2005)) %>%
- select(continent, country, year, health_exp_total) %>%
- na.omit()
+#library(devtools)
+#install_github("drsimonj/ourworldindata")
+
+library(ourworldindata)
+
+id <- financing_healthcare %>%
+ filter(continent %in% c("Oceania", "Europe") & between(year, 2001, 2005)) %>%
+ select(continent, country, year, health_exp_total) %>%
+ na.omit()
- raw data
-
+
## # A tibble: 275 × 4
## continent country year health_exp_total
## <chr> <chr> <int> <dbl>
@@ -665,21 +671,21 @@ 3.3 Quick Intuition on Collective
- individual observations are at the combined country-year level. For the purposes of plotting, though, the “individual geom” will just be the country and all of the yearly observations for each country.
-gd <- id %>%
- group_by(continent, year) %>%
- summarise(health_exp_total = mean(health_exp_total))
-
-
-ggplot(id, aes(x = year, y = health_exp_total, color = continent)) +
- geom_line(aes(group = country), alpha = 0.3) +
- geom_line(data = gd, alpha = 0.8, size = 3) +
- theme_bw() +
- labs(
- title = "Changes in healthcare spending\nacross countries and world regions",
- x = NULL,
- y = "Total healthcare investment ($)",
- color = NULL
- )
+gd <- id %>%
+ group_by(continent, year) %>%
+ summarise(health_exp_total = mean(health_exp_total))
+
+
+ggplot(id, aes(x = year, y = health_exp_total, color = continent)) +
+ geom_line(aes(group = country), alpha = 0.3) +
+ geom_line(data = gd, alpha = 0.8, size = 3) +
+ theme_bw() +
+ labs(
+ title = "Changes in healthcare spending\nacross countries and world regions",
+ x = NULL,
+ y = "Total healthcare investment ($)",
+ color = NULL
+ )
diff --git a/raster-maps.html b/raster-maps.html
index 7763559a..ef82a2dc 100644
--- a/raster-maps.html
+++ b/raster-maps.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/reference-keys.txt b/reference-keys.txt
index 5672bb43..268e04f6 100644
--- a/reference-keys.txt
+++ b/reference-keys.txt
@@ -14,15 +14,19 @@ output
meeting-videos-1
cohort-1-1
individual-geoms
-scatterplot
-line-plot
-histogram
-bar-chart
-geom_polygon-draws-polygons-which-are-filled-paths.
-geom_line-connects-points-from-left-to-right.
-what-low-level-geoms-are-used-to-draw-geom_smooth
-what-low-level-geoms-are-used-to-draw-geom_boxplot
-what-low-level-geoms-are-used-to-draw-geom_violin
+the-basics
+area-chart-geom_area
+bar-chart-geom_bar
+line-chart-geom_line
+scatterplot-geom_point
+polygons-geom_polygon
+histograms-geom_histogram
+drawing-rectangles-geom_rect-geom_tile-geom_raster
+add-text-to-a-plot-geom_text
+exercise-solutions
+exercise-1
+exercise-2
+exercise-3
meeting-videos-2
cohort-1-2
collective-geoms
diff --git a/references-1.html b/references-1.html
index de3fcc16..62b2a441 100644
--- a/references-1.html
+++ b/references-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/references.html b/references.html
index 31cf51de..739c3b23 100644
--- a/references.html
+++ b/references.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/remember-tidytuesday.html b/remember-tidytuesday.html
index 9788fd25..0d74935d 100644
--- a/remember-tidytuesday.html
+++ b/remember-tidytuesday.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/resources-1.html b/resources-1.html
index d437bd98..6f06cf39 100644
--- a/resources-1.html
+++ b/resources-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/revealing-uncertainty.html b/revealing-uncertainty.html
index fb0d41d1..da192537 100644
--- a/revealing-uncertainty.html
+++ b/revealing-uncertainty.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -601,26 +607,26 @@
4.2 Revealing Uncertainty
geom_crossbar()
, geom_pointrange()
3. Continuous x, range: geom_ribbon()
4. Continuous x, range & center: geom_smooth(stat = "identity")
-y <- c(18, 11, 16)
-df <- data.frame(x = 1:3, y = y, se = c(1.2, 0.5, 1.0))
-
-base <- ggplot(df, aes(x, y, ymin = y - se, ymax = y + se))
-base + geom_crossbar()
+y <- c(18, 11, 16)
+df <- data.frame(x = 1:3, y = y, se = c(1.2, 0.5, 1.0))
+
+base <- ggplot(df, aes(x, y, ymin = y - se, ymax = y + se))
+base + geom_crossbar()
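The other geoms named in the list above accept the same `ymin` and `ymax` aesthetics, so they can be swapped onto the same base plot. A minimal sketch, reusing the `df` values from the example; the names `p_pointrange`, `p_ribbon` and `p_smooth` are illustrative and do not appear in the original notes.

```r
library(ggplot2)

df <- data.frame(x = 1:3, y = c(18, 11, 16), se = c(1.2, 0.5, 1.0))
base <- ggplot(df, aes(x, y, ymin = y - se, ymax = y + se))

# Discrete x, range & centre: a point with a vertical line through it
p_pointrange <- base + geom_pointrange()

# Continuous x, range only: a filled band between ymin and ymax
p_ribbon <- base + geom_ribbon(alpha = 0.3)

# Continuous x, range & centre: a line drawn on top of the band
p_smooth <- base + geom_smooth(stat = "identity")
```

Printing any of the three objects draws the corresponding plot. Only the geom changes; the data and the mapping stay the same.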
-
+
-
+
-
+
## x y se
## 1 1 18 1.2
## 2 2 11 0.5
## 3 3 16 1.0
-
+
-
+
-
+
diff --git a/scale-breaks.html b/scale-breaks.html
index 2c647057..e9dfab17 100644
--- a/scale-breaks.html
+++ b/scale-breaks.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/scale-guides.html b/scale-guides.html
index 3508793c..530de8ee 100644
--- a/scale-guides.html
+++ b/scale-guides.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/scale-limits.html b/scale-limits.html
index 7b7020fe..f156a7d9 100644
--- a/scale-limits.html
+++ b/scale-limits.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/scale-transformation.html b/scale-transformation.html
index ea3efbe4..c807eb95 100644
--- a/scale-transformation.html
+++ b/scale-transformation.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/scales-1.html b/scales-1.html
index 1c468622..7d9010fe 100644
--- a/scales-1.html
+++ b/scales-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -599,7 +605,7 @@
19.6 Scales
print(scale_fill_viridis_c)
+
function (name = waiver(), ..., alpha = 1, begin = 0, end = 1,
direction = 1, option = "D", values = NULL, space = "Lab",
na.value = "grey50", guide = "colourbar", aesthetics = "fill")
@@ -608,7 +614,7 @@ 19.6 Scales
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/scaling.html b/scaling.html
index faadcc3c..6dfddf44 100644
--- a/scaling.html
+++ b/scaling.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -612,58 +618,58 @@
21.3 Scaling
-shapes <- data.frame(
- shape = c(0:19, 22, 21, 24, 23, 20),
- x = 0:24 %/% 5,
- y = -(0:24 %% 5)
-)
-ggplot(shapes, aes(x, y)) +
- geom_point(aes(shape = shape), size = 5, fill = "red") +
- geom_text(aes(label = shape), hjust = 0, nudge_x = 0.15) +
- scale_shape_identity() +
- expand_limits(x = 4.1) +
- theme_void()
-
+shapes <- data.frame(
+ shape = c(0:19, 22, 21, 24, 23, 20),
+ x = 0:24 %/% 5,
+ y = -(0:24 %% 5)
+)
+ggplot(shapes, aes(x, y)) +
+ geom_point(aes(shape = shape), size = 5, fill = "red") +
+ geom_text(aes(label = shape), hjust = 0, nudge_x = 0.15) +
+ scale_shape_identity() +
+ expand_limits(x = 4.1) +
+ theme_void()
+
Line type
Line types can be specified with:
An integer or name: 0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash, as shown below:
-lty <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")
-linetypes <- data.frame(
- y = seq_along(lty),
- lty = lty
-)
-ggplot(linetypes, aes(0, y)) +
- geom_segment(aes(xend = 5, yend = y, linetype = lty)) +
- scale_linetype_identity() +
- geom_text(aes(label = lty), hjust = 0, nudge_y = 0.2) +
- scale_x_continuous(NULL, breaks = NULL) +
- scale_y_reverse(NULL, breaks = NULL)
-
+lty <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash")
+linetypes <- data.frame(
+ y = seq_along(lty),
+ lty = lty
+)
+ggplot(linetypes, aes(0, y)) +
+ geom_segment(aes(xend = 5, yend = y, linetype = lty)) +
+ scale_linetype_identity() +
+ geom_text(aes(label = lty), hjust = 0, nudge_y = 0.2) +
+ scale_x_continuous(NULL, breaks = NULL) +
+ scale_y_reverse(NULL, breaks = NULL)
+
Font face
There are only three fonts that are guaranteed to work everywhere: “sans” (the default), “serif”, or “mono”:
-df <- data.frame(x = 1, y = 3:1, family = c("sans", "serif", "mono"))
-ggplot(df, aes(x, y)) +
- geom_text(aes(label = family, family = family))
-
+df <- data.frame(x = 1, y = 3:1, family = c("sans", "serif", "mono"))
+ggplot(df, aes(x, y)) +
+ geom_text(aes(label = family, family = family))
+
Colour and fill
Note that shapes 21-24 have both stroke colour and a fill. The size of the filled part is controlled by size, the size of the stroke is controlled by stroke. Each is measured in mm, and the total size of the point is the sum of the two. Note that the size is constant along the diagonal in the following figure.
-sizes <- expand.grid(size = (0:3) * 2, stroke = (0:3) * 2)
-ggplot(sizes, aes(size, stroke, size = size, stroke = stroke)) +
- geom_abline(slope = -1, intercept = 6, colour = "white", size = 6) +
- geom_point(shape = 21, fill = "red") +
- scale_size_identity()
-
+sizes <- expand.grid(size = (0:3) * 2, stroke = (0:3) * 2)
+ggplot(sizes, aes(size, stroke, size = size, stroke = stroke)) +
+ geom_abline(slope = -1, intercept = 6, colour = "white", size = 6) +
+ geom_point(shape = 21, fill = "red") +
+ scale_size_identity()
+
Horizontal and vertical justification
have the same parameterisation, either a string (“top”, “middle”, “bottom”, “left”, “center”, “right”) or a number between 0 and 1:
top = 1, middle = 0.5, bottom = 0
left = 0, center = 0.5, right = 1
-just <- expand.grid(hjust = c(0, 0.5, 1), vjust = c(0, 0.5, 1))
-just$label <- paste0(just$hjust, ", ", just$vjust)
-
-ggplot(just, aes(hjust, vjust)) +
- geom_point(colour = "grey70", size = 5) +
- geom_text(aes(label = label, hjust = hjust, vjust = vjust))
-
+just <- expand.grid(hjust = c(0, 0.5, 1), vjust = c(0, 0.5, 1))
+just$label <- paste0(just$hjust, ", ", just$vjust)
+
+ggplot(just, aes(hjust, vjust)) +
+ geom_point(colour = "grey70", size = 5) +
+ geom_text(aes(label = label, hjust = hjust, vjust = vjust))
+
diff --git a/geom_line-connects-points-from-left-to-right..html b/scatterplot-geom_point.html
similarity index 91%
rename from geom_line-connects-points-from-left-to-right..html
rename to scatterplot-geom_point.html
index aa6b7df0..fba7e3ea 100644
--- a/geom_line-connects-points-from-left-to-right..html
+++ b/scatterplot-geom_point.html
@@ -4,18 +4,18 @@
- 2.6 geom_line() connects points from left to right. | ggplot2 Book Club
+ 2.5 Scatterplot: geom_point() | ggplot2 Book Club
-
+
-
+
@@ -23,15 +23,15 @@
-
+
-
-
+
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,18 +600,28 @@
-
-
+
+
diff --git a/search_index.json b/search_index.json
index df296cc7..b2e85c2f 100644
--- a/search_index.json
+++ b/search_index.json
@@ -1 +1 @@
-[["index.html", "ggplot2 Book Club Welcome", " ggplot2 Book Club The Data Science Learning Community 2024-08-01 Welcome This is a companion for the book ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen. This companion is available at https://r4ds.github.io/bookclub-ggplot2. This website is being developed by the Data Science Learning Community. Follow along, and join the community to participate. This companion follows the Data Science Learning Community Code of Conduct. "],["book-club-meetings.html", "Book club meetings", " Book club meetings Each week, a volunteer will present a chapter from the book. This is the best way to learn the material. Presentations will usually consist of a review of the material, a discussion, and/or a demonstration of the principles presented in that chapter. More information about how to present is available in the github repo. Presentations will be recorded, and will be available on the Data Science Learning Community YouTube Channel. "],["introduction.html", "Introduction", " Introduction Learning objectives: Introduce yourself! Determine whether this club is for you. We will go over the different sections of the book. "],["hi-my-name-is.html", "Hi, my name is…", " Hi, my name is… Camera on or raise your hand if you’re willing to introduce yourself! Name Location and/or timezone Any previous DSLC clubs? Why are you here? "],["present-a-chapter.html", "Present a chapter!", " Present a chapter! Each member of the book club will have the opportunity to lead a chapter. We recommend the following format: Use the slides. Try to not use the book, remember we only have one hour! However, sometimes it could be useful to jump into RStudio and have the code ready to create a graph. Remember to increase font size to 14 (at least) by going to Tools > Global Options > Appearance > Editor font size Try to keep all the content in one visible slide. 
Pick chapters that interest you either because it’s content you know but would like to learn more of or chapters of things you want to get better at. Follow the How to present instructions on the GitHub README for this Book Club Start each session with start in the comments and end the session with end Introduce the chapter by saying the name of the book we are reading, the cohort, the chapter, and your name. If the book has exercises or you have a specific question regarding something about the chapter, then make sure you have the code ready in RStudio so we can go over this in the last 10 min of the hour. "],["remember-tidytuesday.html", "Remember #TidyTuesday", " Remember #TidyTuesday #TidyTuesday is a great source to keep handy when you are trying to learn about ggplot2. You can follow the hashtag on X, Mastodon, or BlueSky and find other researchers posting links to their GitHub repos. I have learned a lot by studying these repos. "],["welcome-to-ggplot2.html", "Welcome to ggplot2", " Welcome to ggplot2 ggplot2 has an underlying grammar, based on the Grammar of Graphics (Wilkinson 2005), that allows you to compose graphs by combining independent components. You can produce publication-quality graphics in seconds. However, ggplot2’s comprehensive themeing system makes it easy to do what you want. ggplot2 is designed to work iteratively. You start with a layer that shows the raw data. Then you add layers of annotations and statistical summaries. "],["grammar-of-graphics.html", "Grammar of graphics", " Grammar of graphics The grammar tells us that a graphic maps the data to the aesthetic attributes (color, shape, size) of geometric objects (points, lines, bars). The plot may also include statistical transformations of the data and information about the plot’s coordinate system. Faceting can be used to plot for different subsets of the data. The combination of these independent components are what make up a graphic. 
"],["mapping-components.html", "Mapping components", " Mapping components Plots are composed of the data, the information you want to visualise, and a mapping, the description of how the data’s variables are mapped to aesthetic attributes. There are five mapping components: Layer is a collection of geometric elements and statistical transformations. Geoms for short. Scale: maps values in the data space to values in the aesthetic space. Coord: coordinate system, describes data coordinates to the plane of the graphic. Facet: specifies how to break up and display subsets of data as small multiples. Theme: controls the finer points of display. "],["about-this-book.html", "About this book", " About this book Chapter 2: This chapter introduces several important ggplot2 concepts: geoms, aesthetic mappings and facetting. Chapter 3-9: explore how to use the basic toolbox to solve a wide range of visualisation problems that you’re likely to encounter in practice. Chapter 10-12: show you how to control the most important scales, allowing you to tweak the details of axes and legends. Chapter 13: demonstrates how to add additional layers to your plot, exercising full control over the geoms and stats used within them. Chapter 10-12: will show you what scales are available, how to adjust their parameters, and how to control the appearance of axes and legends. Section 13.7: Faceting is a very powerful graphical tool as it allows you to rapidly compare different subsets of your data. Chapter 17: you will learn about how to control the theming system of ggplot2 and how to save plots to disk. 
"],["prerequisites.html", "Prerequisites", " Prerequisites install.packages(c( "colorBlindness", "directlabels", "dplyr", "ggforce", "gghighlight", "ggnewscale", "ggplot2", "ggraph", "ggrepel", "ggtext", "ggthemes", "hexbin", "Hmisc", "mapproj", "maps", "munsell", "ozmaps", "paletteer", "patchwork", "rmapshaper", "scico", "seriation", "sf", "stars", "tidygraph", "tidyr", "wesanderson" )) "],["meeting-videos.html", "Meeting Videos", " Meeting Videos 0.0.1 Cohort 1 Meeting chat log 00:23:18 Michael Haugen: Could we do 12:30pm CST? I have a meeting until then. 00:25:25 Michael Haugen: Thanks! 00:45:10 Kent Johnson: GitHub repo:https://r4ds.github.io/bookclub-ggplot2/ "],["first-steps.html", "Chapter 1 First Steps ", " Chapter 1 First Steps "],["general-housekeeping-items.html", "1.1 General Housekeeping Items", " 1.1 General Housekeeping Items This is a learning opportunity so feel free to ask any question at any time. Take time to learn the theory, in particular Grammar of Graphics. Please do the chapter exercises. Second-best learning opportunity! Please plan to facilitate one of the discussions. Best learning opportunity! "],["learning-objectives.html", "1.2 Learning Objectives", " 1.2 Learning Objectives Brief introduction to ggplot’s capabilities Learn about key components of every plot: data, aesthetics, geoms Learn about faceting See a few different geoms Modify the axes Save the plot to disk "],["introduction-1.html", "1.3 Introduction", " 1.3 Introduction Leland Wilkinson (Grammar of Graphics, 1999) formalized two main principles in his plotting framework: Graphics = distinct layers of grammatical elements Meaningful plots through aesthetic mappings The essential grammatical elements to create any visualization with {ggplot2} are: "],["main-data-set.html", "1.4 Main data set", " 1.4 Main data set For this chapter, we’ll mainly use the mpg dataset that comes with ggplot. 
mpg ## # A tibble: 234 × 11 ## manufacturer model displ year cyl trans drv cty hwy fl class ## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> ## 1 audi a4 1.8 1999 4 auto… f 18 29 p comp… ## 2 audi a4 1.8 1999 4 manu… f 21 29 p comp… ## 3 audi a4 2 2008 4 manu… f 20 31 p comp… ## 4 audi a4 2 2008 4 auto… f 21 30 p comp… ## 5 audi a4 2.8 1999 6 auto… f 16 26 p comp… ## 6 audi a4 2.8 1999 6 manu… f 18 26 p comp… ## 7 audi a4 3.1 2008 6 auto… f 18 27 p comp… ## 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp… ## 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp… ## 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp… ## # ℹ 224 more rows cty and hwy are miles per gallon measures displ is engine displacement in litres drv is front wheel (f), rear wheel (r) or four wheel (4) model is the model of the car class is two-seater, SUV, compact, etc. "],["components-of-every-plot.html", "1.5 Components of every plot", " 1.5 Components of every plot Three components ggplot(mpg, aes(x = displ, y = hwy)) + geom_point() It’s allowable to omit the x = and y = arguments of aes. In other words, aes(displ, hwy) would be valid for this plot. "],["other-aesthetic-attributes.html", "1.6 Other aesthetic attributes", " 1.6 Other aesthetic attributes color, shape and size can be mapped to variables in the data The class variable of the mpg dataset has seven unique values. The plot can assign a specific color to each value by mapping class to color within the aesthetic function. ## # A tibble: 7 × 1 ## class ## <chr> ## 1 compact ## 2 midsize ## 3 suv ## 4 2seater ## 5 minivan ## 6 pickup ## 7 subcompact ggplot(mpg, aes(displ, hwy, color = class)) + geom_point() Including a color assignment outside the aesthetic of the geometry layer will make all of the points that color. ggplot(mpg, aes(displ, hwy)) + geom_point(color = "blue") Mapping a variable to shape and color adds some diversity and information to the plot. 
ggplot(mpg, aes(displ, hwy, shape = drv, color = drv)) + geom_point() Mapping a variable to size can also add some new insights. ggplot(mpg, aes(manufacturer, drv, size = displ)) + geom_point() + theme(axis.text.x = element_text(angle = 90)) "],["faceting.html", "1.7 Faceting", " 1.7 Faceting Faceting creates graphics by splitting the data into subsets and displaying the same graph for each subset. Really helpful if there are lots of values, making color/shape less meaningful. ggplot(mpg, aes(displ, hwy)) + geom_point() + facet_wrap(~class) Exercise: Use faceting to explore the three-way relationship between fuel economy, engine size and number of cylinders. How does faceting by number of cylinders change your assessment of the relationship between engine size and fuel economy? "],["geoms.html", "1.8 Geoms", " 1.8 Geoms The geom_point() geom gives a familiar scatterplot. Other geoms include: geom_smooth() which fits a smooth line to the data check help to see geom_smooth’s arguments like method, se or span. ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth() geom_boxplot() which generates a box-and-whisker plot check help to see geom_boxplot’s arguments like outlier arguments, and coef which adjusts the whisker length. ggplot(mpg, aes(drv, hwy)) + geom_boxplot() consider boxplot variants like geom_jitter and geom_violin ggplot(mpg, aes(drv, hwy)) + geom_jitter() ggplot(mpg, aes(drv, hwy)) + geom_violin() geom_histogram which generates a histogram and geom_freqpoly which generates a frequency polygon check help to see geom_histogram’s arguments like position and binwidth. ggplot(mpg, aes(hwy)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. ggplot(mpg, aes(hwy)) + geom_freqpoly() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. 
geom_bar which generates a bar chart check help to see geom_bar’s arguments like position and width ggplot(diamonds, aes(cut)) + geom_bar() This graph below uses displ for y in the aesthetic and uses the stat of identity so that it sums the total displacement for each manufacturer. ggplot(mpg, aes(manufacturer, displ)) + geom_bar(stat = "identity") This plot now shows the total displacement. mpg %>% group_by(manufacturer) %>% summarize(sum(displ)) ## # A tibble: 15 × 2 ## manufacturer `sum(displ)` ## <chr> <dbl> ## 1 audi 45.8 ## 2 chevrolet 96.2 ## 3 dodge 162 ## 4 ford 113. ## 5 honda 15.4 ## 6 hyundai 34 ## 7 jeep 36.6 ## 8 land rover 17.2 ## 9 lincoln 16.2 ## 10 mercury 17.6 ## 11 nissan 42.5 ## 12 pontiac 19.8 ## 13 subaru 34.4 ## 14 toyota 100. ## 15 volkswagen 60.9 geom_line and geom_path which generates a line chart or path chart (useful for time series data) check help to see geom_line’s arguments like lineend and arrow ggplot(economics, aes(date, unemploy / pop)) + geom_line() ggplot(economics, aes(date, uempmed)) + geom_line() To investigate these plots further, we can draw them on the same plot. year <- function(x) as.POSIXlt(x)$year + 1900 ggplot(economics, aes(unemploy / pop, uempmed)) + geom_path(color = "grey50") + geom_point(aes(color = year(date))) "],["modifying-the-axes.html", "1.9 Modifying the Axes", " 1.9 Modifying the Axes xlab() and ylab() modify the axis labels ggplot(mpg, aes(cty, hwy)) + geom_point(alpha = 1/3) ggplot(mpg, aes(cty, hwy)) + geom_point(alpha = 1/3) + xlab("city driving (mpg)") + ylab("highway driving (mpg)") # remove labels with NULL ggplot(mpg, aes(cty, hwy)) + geom_point(alpha = 1/3) + xlab(NULL) + ylab(NULL) xlim() and ylim() modify the limits of the axes (boundaries) ggplot(mpg, aes(drv, hwy)) + geom_jitter(width = 0.25) ggplot(mpg, aes(drv, hwy)) + geom_jitter(width = 0.25) + xlim("f", "r") + ylim(20, 30) ## Warning: Removed 138 rows containing missing values or values outside the scale range ## (`geom_point()`). 
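The warning above appears because xlim() and ylim() set scale limits, and observations outside those limits are treated as missing before the plot is drawn. When the aim is only to zoom in, coord_cartesian() changes the visible region without discarding data. A minimal sketch comparing the two; the object names are made up for illustration:

```r
library(ggplot2)

p_base <- ggplot(mpg, aes(drv, hwy)) +
  geom_jitter(width = 0.25)

# Scale limits: values outside 20-30 become missing and are not drawn.
p_limits <- p_base + ylim(20, 30)

# Coordinate limits: every row is kept, only the view is zoomed.
p_zoom <- p_base + coord_cartesian(ylim = c(20, 30))

# Compare how many rows still carry a usable y value in each version.
sum(!is.na(layer_data(p_limits)$y))
nrow(layer_data(p_zoom))
```

The distinction matters most when a layer computes a statistic (a smoother or a boxplot, for example), because scale limits change the data the statistic sees while coordinate limits do not.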
"],["output.html", "1.10 Output", " 1.10 Output Save the plot to a variable p <- ggplot(mpg, aes(displ, hwy, color = factor(cyl))) + geom_point() Then print it print(p) Save it to disk ggsave("plot.png", p, width = 5, height = 5) Describe its structure summary(p) ## data: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy, fl, ## class [234x11] ## mapping: x = ~displ, y = ~hwy, colour = ~factor(cyl) ## faceting: <ggproto object: Class FacetNull, Facet, gg> ## compute_layout: function ## draw_back: function ## draw_front: function ## draw_labels: function ## draw_panels: function ## finish_data: function ## init_scales: function ## map_data: function ## params: list ## setup_data: function ## setup_params: function ## shrink: TRUE ## train_scales: function ## vars: function ## super: <ggproto object: Class FacetNull, Facet, gg> ## ----------------------------------- ## geom_point: na.rm = FALSE ## stat_identity: na.rm = FALSE ## position_identity "],["meeting-videos-1.html", "1.11 Meeting Videos", " 1.11 Meeting Videos 1.11.1 Cohort 1 Meeting chat log 00:04:11 Lydia Gibson: Hello! I missed last week but hoping to join weekly moving forward. 00:37:49 June Choe: there's a good cheatsheet for this -- https://ggplot2tor.com/aesthetics 00:54:29 Michael Haugen: One can use geom_col() as well which will work similar to stats = identity 00:58:12 Michael Haugen: section 3.8 in R4DS may be relevant here as well: https://r4ds.had.co.nz/data-visualisation.html 01:07:43 Federica Gazzelloni: thn 01:07:48 June Choe: thanks! "],["individual-geoms.html", "Chapter 2 Individual Geoms", " Chapter 2 Individual Geoms Geoms are the fundamental building blocks of ggplot2. Most of the geoms are associated with a named plot. Some geoms can be added on to low-level geoms to create more complex plots. To find out more about individual geoms see their documentation. 
"],["scatterplot.html", "2.1 Scatterplot:", " 2.1 Scatterplot: ggplot(mpg, aes(x = displ, y = hwy)) + geom_point() "],["line-plot.html", "2.2 Line plot:", " 2.2 Line plot: ggplot(economics, aes(date, unemploy / pop)) + geom_line() "],["histogram.html", "2.3 Histogram:", " 2.3 Histogram: ggplot(mpg, aes(hwy)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. "],["bar-chart.html", "2.4 Bar chart", " 2.4 Bar chart ggplot(mpg, aes(manufacturer)) + geom_bar() ## geom_path() connects points in order of appearance. p + geom_path() "],["geom_polygon-draws-polygons-which-are-filled-paths..html", "2.5 geom_polygon() draws polygons which are filled paths.", " 2.5 geom_polygon() draws polygons which are filled paths. p + geom_polygon() "],["geom_line-connects-points-from-left-to-right..html", "2.6 geom_line() connects points from left to right.", " 2.6 geom_line() connects points from left to right. p + geom_line() "],["what-low-level-geoms-are-used-to-draw-geom_smooth.html", "2.7 What low-level geoms are used to draw geom_smooth()?", " 2.7 What low-level geoms are used to draw geom_smooth()? Geom_smooth() fits a smoother to data, displaying the smooth and its standard error, allowing you to see a dominant pattern within a scatterplot with a lot of “noise”. The low level geom for geom_smooth() are geom_path(), geom_area() and geom_point(). ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth() ## `geom_smooth()` using method = 'loess' and formula = 'y ~ x' "],["what-low-level-geoms-are-used-to-draw-geom_boxplot.html", "2.8 What low-level geoms are used to draw geom_boxplot()?", " 2.8 What low-level geoms are used to draw geom_boxplot()? Box plots are used to summarize the distribution of a set of points using summary statistics. The low level geom for geom_boxplot() are geom_rect(), geom_line() and geom_point(). 
ggplot(mpg, aes(drv, hwy)) + geom_boxplot() "],["what-low-level-geoms-are-used-to-draw-geom_violin.html", "2.9 What low-level geoms are used to draw geom_violin()?", " 2.9 What low-level geoms are used to draw geom_violin()? Violin plots show a compact representation of the density of the distribution highlighting the areas where most of the points are found. The low level geom for geom_violin() are geom_area() and geom_path(). ggplot(mpg, aes(drv, hwy)) + geom_violin() "],["meeting-videos-2.html", "2.10 Meeting Videos", " 2.10 Meeting Videos 2.10.1 Cohort 1 Meeting chat log 00:13:39 priyanka gagneja: I forgot to mention that since this is relatively smaller chapter Ryan has prepared some material introducing Chapter 4 for today and he will talking about the entire chapter next week. 00:16:38 priyanka gagneja: that's correct 00:16:42 priyanka gagneja: that's my understanding too 00:18:38 priyanka gagneja: what do you mean circles .. can you share a more detailed example 00:21:59 Jiwan Heo: tibble(id = 1:10) %>% mutate(x = cos(2*pi*id/10), y = sin(2*pi*id/10)) %>% ggplot(aes(x, y)) + geom_line() + coord_equal() 00:22:05 Jiwan Heo: vs tibble(id = 1:10) %>% mutate(x = cos(2*pi*id/10), y = sin(2*pi*id/10)) %>% ggplot(aes(x, y)) + geom_path() + coord_equal() 00:35:42 priyanka gagneja: Thank you Ryan !! 00:38:16 priyanka gagneja: need a min 00:52:22 Michael Haugen: “Side rail” no pun intended 00:52:34 Ryan S: lol 00:52:34 Michael Haugen: sounds great 00:54:02 Ryan Metcalf: I was going to use Derail…..no pun intended! 00:54:05 Ryan Metcalf: Thanks you everyone! "],["collective-geoms.html", "Chapter 3 Collective Geoms ", " Chapter 3 Collective Geoms "],["general-housekeeping-items-1.html", "3.1 General Housekeeping Items", " 3.1 General Housekeeping Items This is a learning opportunity so feel free to ask any question at any time. Take time to learn the theory, in particular Grammar of Graphics. Please do the chapter exercises. Second-best learning opportunity! 
Please plan to facilitate one of the discussions. Best learning opportunity! "],["learning-objectives-1.html", "3.2 Learning Objectives", " 3.2 Learning Objectives Understand the difference between individual geoms and collective geoms Explore some plots that use individual and collective geoms together Reinforce understanding of the Grammar of Graphics (particularly the use of layers) to create plots "],["quick-intuition-on-collective-geoms.html", "3.3 Quick Intuition on Collective Geoms", " 3.3 Quick Intuition on Collective Geoms Last chapter was on individual geoms. This chapter is on collective geoms. Oversimplification (but maybe useful) individual numbers vs the sum of the numbers sum converts a series of numbers (“individual”): 4, 7, 9, 3, 3 to a single number (“collective”): 26 home prices under individual geoms each home price has a point on a plot/table under collective geoms we may use median as a single number that summarizes all individuals This blog post by Simon Jackson illustrates these foundations using mtcars. The points are individual geoms and the bars are a collective geom showing the average of the individual observations. id <- mtcars %>% tibble::rownames_to_column() %>% as_tibble() %>% mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))) gd <- id %>% group_by(am) %>% summarise(hp = mean(hp)) ggplot(id, aes(x = am, y = hp, color = am, fill = am)) + geom_bar(data = gd, stat = "identity", alpha = 0.3) + ggrepel::geom_text_repel(aes(label = rowname), color = "black", size = 2.5, segment.color = "grey") + geom_point() + guides(color = "none", fill = "none") + theme_bw() + labs( title = "Car horsepower by transmission type", x = "Transmission", y = "Horsepower" ) Next, a separate longitudinal study from the blog post (because the book example is also a longitudinal study). This example uses the ourworldindata dataset which shows healthcare spending per country over time. 
#library(devtools) #install_github("drsimonj/ourworldindata") library(ourworldindata) id <- financing_healthcare %>% filter(continent %in% c("Oceania", "Europe") & between(year, 2001, 2005)) %>% select(continent, country, year, health_exp_total) %>% na.omit() raw data id ## # A tibble: 275 × 4 ## continent country year health_exp_total ## <chr> <chr> <int> <dbl> ## 1 Europe Albania 2001 198. ## 2 Europe Albania 2002 225. ## 3 Europe Albania 2003 236. ## 4 Europe Albania 2004 264. ## 5 Europe Albania 2005 277. ## 6 Europe Andorra 2001 1432. ## 7 Europe Andorra 2002 1565. ## 8 Europe Andorra 2003 1601. ## 9 Europe Andorra 2004 1662. ## 10 Europe Andorra 2005 1794. ## # ℹ 265 more rows individual observations are at the combined country-year level. For the purposes of plotting, though, the “individual geom” will just be the country and all of the yearly observations for each country. gd <- id %>% group_by(continent, year) %>% summarise(health_exp_total = mean(health_exp_total)) ggplot(id, aes(x = year, y = health_exp_total, color = continent)) + geom_line(aes(group = country), alpha = 0.3) + geom_line(data = gd, alpha = 0.8, size = 3) + theme_bw() + labs( title = "Changes in healthcare spending\\nacross countries and world regions", x = NULL, y = "Total healthcare investment ($)", color = NULL ) "],["from-the-ggplot2-book.html", "3.4 From the ggplot2 book", " 3.4 From the ggplot2 book dataset called Oxboys which shows the age and corresponding height of 26 boys from Oxford. also a longitudinal study. note that the age is standardized. 
data(Oxboys, package = "nlme") head(Oxboys, 9) ## Grouped Data: height ~ age | Subject ## Subject age height Occasion ## 1 1 -1.0000 140.5 1 ## 2 1 -0.7479 143.4 2 ## 3 1 -0.4630 144.8 3 ## 4 1 -0.1643 147.1 4 ## 5 1 -0.0027 147.7 5 ## 6 1 0.2466 150.2 6 ## 7 1 0.5562 151.7 7 ## 8 1 0.7781 153.3 8 ## 9 1 0.9945 155.8 9 3.4.1 Multiple Groups, One Aesthetic As the book says: In many situations, you want to separate your data into groups, but render them in the same way. In other words, you want to be able to distinguish individual subjects but not identify them. sometimes you want the individual geom to be a group of observations for the same individual. you do this by adding a group argument to the aesthetic. If you’re trying to figure out which variable to use as the grouping variable, fill in the blank “I have multiple observations for each _____”. Or for longitudinal studies, “I want to plot one line over time for each _____”. What’s the grouping variable for Oxboys? In the case of Oxboys, we want to plot a line over time for each boy, so Subject is the grouping variable in the aesthetic. ggplot(Oxboys, aes(age, height, group = Subject)) + geom_point() + geom_line() incorrectly specifying the grouping variable leads to a “characteristic sawtooth appearance”. ggplot(Oxboys, aes(age, height)) + geom_point() + geom_line() 3.4.2 Different Groups on Different Layers From the book: Sometimes we want to plot summaries that use different levels of aggregation: one layer might display individuals, while another displays an overall summary. now that we have plotted individual geoms, let’s add a collective geom which is the trendline for all boys together. 
ggplot(Oxboys, aes(age, height, group = Subject)) + geom_line() + geom_point() + geom_smooth(method = "lm", se = FALSE) ## `geom_smooth()` using formula = 'y ~ x' #> `geom_smooth()` using formula 'y ~ x' something doesn’t look right expecting a collective geom (one summary line for all subjects), but we got individual geoms again – a trendline for each individual instead of a trendline for all individuals. “grouping controls both the display of the geoms, and the operation of the stats: one statistical transformation is run for each group”. we got multiple geom_smooths because we had the grouping variable in the ggplot line so the grouping flows down to all layers of the plot to get what we intend, we need to uncouple the grouping variable at the ggplot layer and add it where we want the grouping to happen, namely only at the geom_line layer. That allows the default grouping from the ggplot layer (i.e., no special grouping or just group on the whole dataset) to flow down to the geom_smooth layer. ggplot(Oxboys, aes(age, height)) + geom_line(aes(group = Subject)) + geom_point() + geom_smooth(method = "lm", size = 2, se = FALSE) ## `geom_smooth()` using formula = 'y ~ x' #> `geom_smooth()` using formula 'y ~ x' 3.4.3 Overriding the Default Grouping In the last exercise, we finally got the grouping right. This hints at the approach of overriding the default grouping. By adding the grouping to geom_line, we overrode the default grouping, which was “no special grouping”. Here’s another example to help illustrate this point a little better. Thanks to this blog post. Subtitles are added to these plots to describe what’s going on. 
ggplot(mpg, aes(drv, hwy)) + geom_jitter() + stat_boxplot(fill = NA) + labs(subtitle = "stat_boxplot automatically uses the groups set by the categorical variable drv.\\nNotice that there is only one boxplot for each value of drv.") ggplot(mpg, aes(drv, hwy, color = factor(year))) + geom_jitter() + stat_boxplot(fill = NA) + labs(subtitle = "by now adding color based on year, it creates a new group for the boxplots as well,\\nand there are now two for each categorical. This may not be what you want.") ggplot(mpg, aes(drv, hwy, color = factor(year))) + geom_jitter() + stat_boxplot(fill = NA, aes(group = drv)) + labs(subtitle = "we override the default or earlier grouping by adding\\na group -- inside the aes -- on the layer where we want it") ## Warning: The following aesthetics were dropped during statistical transformation: ## colour. ## ℹ This can happen when ggplot fails to infer the correct grouping structure in ## the data. ## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical ## variable into a factor? 3.4.4 A couple of exercises mpg %>% head(2) ## # A tibble: 2 × 11 ## manufacturer model displ year cyl trans drv cty hwy fl class ## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> ## 1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa… ## 2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa… #Draw a boxplot of hwy for each value of cyl, without turning cyl into a factor. What extra aesthetic do you need to set? # Wrong... but cyl is an integer data type -- are integers considered continuous? ggplot(mpg, aes(cyl, hwy)) + geom_boxplot() ## Warning: Continuous x aesthetic ## ℹ did you forget `aes(group = ...)`? # Right ggplot(mpg, aes(cyl, hwy, group = as.factor(cyl))) + geom_boxplot() #Modify the following plot so that you get one boxplot per integer value of displ. ggplot(mpg, aes(displ, cty)) + geom_boxplot() ## Warning: Continuous x aesthetic ## ℹ did you forget `aes(group = ...)`? 
# probably better ways to do this, especially ways to make the boxplot line up with the x-axis ggplot(mpg, aes(x = ceiling(displ), cty, group = ceiling(displ))) + geom_boxplot() 3.4.5 Matching Aesthetics to Graphic Objects (Not covered in the preso) "],["meeting-videos-3.html", "3.5 Meeting Videos", " 3.5 Meeting Videos 3.5.1 Cohort 1 Meeting chat log 00:21:57 Michael Haugen: only thing I can think of is if 1 equals the first column of the data frame. 01:02:43 priyanka gagneja: thanks Ryan and everyone else 01:06:10 Jiwan Heo: https://github.com/r4ds/bookclub-ps4ds "],["statistical-summaries.html", "Chapter 4 Statistical Summaries", " Chapter 4 Statistical Summaries Learning Objectives: Use ggplot2 to plot the uncertainty in your data Determine which geometric object (geom) best presents your type of data "],["defintions-in-this-chapter.html", "4.1 Definitions (in this Chapter)", " 4.1 Definitions (in this Chapter) discrete value: a value drawn from a countable set of distinct values, such as a category or a count (input user definition welcomed) continuous value: a value that can take any number within a range, such as a measurement (input user definition welcomed) grob: graphical object overplotting: so many points on a scatterplot that they overlap and obscure the underlying relationships "],["revealing-uncertainty.html", "4.2 Revealing Uncertainty", " 4.2 Revealing Uncertainty Four primary types of geometric objects (geom) are used: 1. Discrete x, range: geom_errorbar(), geom_linerange() 2. Discrete x, range & center: geom_crossbar(), geom_pointrange() 3. Continuous x, range: geom_ribbon() 4. 
Continuous x, range & center: geom_smooth(stat = \"identity\") y <- c(18, 11, 16) df <- data.frame(x = 1:3, y = y, se = c(1.2, 0.5, 1.0)) base <- ggplot(df, aes(x, y, ymin = y - se, ymax = y + se)) base + geom_crossbar() base + geom_pointrange() base + geom_smooth(stat = "identity") df ## x y se ## 1 1 18 1.2 ## 2 2 11 0.5 ## 3 3 16 1.0 base + geom_errorbar() base + geom_linerange() base + geom_ribbon() "],["weighted-data.html", "4.3 Weighted Data", " 4.3 Weighted Data If each row of your dataframe contains multiple observations, we can use a weight to visually give scale to observations # Unweighted ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point() # Weight by population ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point(aes(size = poptotal / 1e6)) + scale_size_area("Population\\n(millions)", breaks = c(0.5, 1, 2, 4)) # Unweighted ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point() + geom_smooth(method = lm, size = 1) ## `geom_smooth()` using formula = 'y ~ x' # Weighted by population ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point(aes(size = poptotal / 1e6)) + geom_smooth(aes(weight = poptotal), method = lm, size = 1) + scale_size_area(guide = "none") ## `geom_smooth()` using formula = 'y ~ x' ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(binwidth = 1) + ylab("Counties") ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(aes(weight = poptotal), binwidth = 1) + ylab("Population (1000s)") Question for the group: Is the above ylab correct? Check out the next two figures, can you see the difference? 
ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(aes(weight = poptotal/1e3), binwidth = 1) + ylab("Population (1000s)") ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(aes(weight = poptotal/1e6), binwidth = 1) + ylab("Population (millions)") "],["displaying-distributions.html", "4.4 Displaying distributions", " 4.4 Displaying distributions Using built-in diamonds dataset Figure 4.1: Diamond Dimensions For 1-Dimensional continuous data (1d), the histogram is arguably the most important geom ggplot(diamonds, aes(depth)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. ggplot(diamonds, aes(depth)) + geom_histogram(binwidth = 0.1) + xlim(55, 70) ## Warning: Removed 45 rows containing non-finite outside the scale range ## (`stat_bin()`). ## Warning: Removed 2 rows containing missing values or values outside the scale range ## (`geom_bar()`). Never rely on the defaults. Always adjust your binwidth or xlim to “zoom” in or out of your data. There is no hard-and-fast rule, only experimentation to discover the patterns in your plot. For your audience/reader, ensure you add a caption for your scale, for example binwidth. Three ways to compare distributions: - Show small multiples of the histogram, facet_wrap(~ var). - Use colour and a frequency polygon, geom_freqpoly(). - Use a “conditional density plot”, geom_histogram(position = "fill"). ggplot(diamonds, aes(depth)) + geom_freqpoly(aes(colour = cut), binwidth = 0.1, na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") ggplot(diamonds, aes(depth)) + geom_histogram(aes(fill = cut), binwidth = 0.1, position = "fill", na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") You can also plot density using geom_density(). Use a density plot when you know that the underlying density is smooth, continuous and unbounded. 
ggplot(diamonds, aes(depth)) + geom_density(na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") ggplot(diamonds, aes(depth, fill = cut, colour = cut)) + geom_density(alpha = 0.2, na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") When comparing many distributions, it is often advisable to sacrifice quality (detail) for quantity (the number of distributions shown). The following three types of graph illustrate this trade-off. geom_boxplot(): ggplot(diamonds, aes(clarity, depth)) + geom_boxplot() ggplot(diamonds, aes(carat, depth)) + geom_boxplot(aes(group = cut_width(carat, 0.1))) + xlim(NA, 2.05) ## Warning: Removed 997 rows containing missing values or values outside the scale range ## (`stat_boxplot()`). geom_violin(): ggplot(diamonds, aes(clarity, depth)) + geom_violin() ggplot(diamonds, aes(carat, depth)) + geom_violin(aes(group = cut_width(carat, 0.1))) + xlim(NA, 2.05) ## Warning: Removed 997 rows containing non-finite outside the scale range ## (`stat_ydensity()`). geom_dotplot(): 4.4.1 Exercise: What binwidth tells you the most interesting story about the distribution of carat? Choosing the number of bins or the binwidth should be an exploratory exercise. There is no hard-and-fast rule for choosing the binwidth. What matters is finding the size that best represents the distribution in your analysis. This ties into the story you are telling. Find a binwidth that best captures your ideas. Draw a histogram of price. What interesting patterns do you see? ggplot(diamonds, aes(price)) + geom_histogram(binwidth = 5) The higher the price, the smaller the quantity (assuming quality). I presume that carat size would also have a strong correlation with quantity and price. How does the distribution of price vary with clarity? ggplot(diamonds, aes(clarity, price)) + geom_violin() ggplot(diamonds, aes(clarity, price)) + geom_boxplot() Using different geoms, I presume that the higher the clarity, the higher the price and the smaller the quantity. 
Overlay a frequency polygon and density plot of depth. What computed variable do you need to map to y to make the two plots comparable? (You can either modify geom_freqpoly() or geom_density().) Not completed. "],["dealing-with-overplotting.html", "4.5 Dealing with overplotting", " 4.5 Dealing with overplotting The scatterplot is a very important tool for assessing the relationship between two variables Too large a dataset may obscure any true relationship This is called overplotting To compensate for overplotting, tweaking the aesthetics can help. Techniques like hollow glyphs can help. df <- data.frame(x = rnorm(2000), y = rnorm(2000)) norm <- ggplot(df, aes(x, y)) + xlab(NULL) + ylab(NULL) norm + geom_point() norm + geom_point(shape = 1) # Hollow circles norm + geom_point(shape = ".") # Pixel sized For larger data sets, you can use alpha blending (transparency). If you specify alpha as a ratio, the denominator gives the number of points that must be overplotted to give a solid color. norm + geom_point(alpha = 1 / 3) norm + geom_point(alpha = 1 / 5) norm + geom_point(alpha = 1 / 10) geom_jitter() can be used if your data has some discreteness. By default, the jitter is 40% of the resolution of the data. You can override the default with the width and height arguments. Alternatively, we can think of overplotting as a 2d density estimation problem, which gives rise to two more approaches: Bin the points and count the number in each bin, then visualise that count (the 2d generalisation of the histogram), geom_bin2d(). The code below compares square and hexagonal bins, using parameters bins and binwidth to control the number and size of the bins. norm + geom_bin2d() norm + geom_bin2d(bins = 10) library(hexbin) norm + geom_hex() norm + geom_hex(bins = 10) Another approach to dealing with overplotting is to add data summaries to help guide the eye to the true shape of the pattern within the data. 
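One way to add such a summary, sketched here under the assumption that a fitted straight line is the summary of interest, is to layer geom_smooth() over semi-transparent points. The seed, the alpha value, the line colour and the object names are arbitrary choices for illustration:

```r
library(ggplot2)

set.seed(123)  # arbitrary seed so the simulated data are reproducible
sim <- data.frame(x = rnorm(2000), y = rnorm(2000))

p_summary <- ggplot(sim, aes(x, y)) +
  geom_point(alpha = 1 / 5) +                                  # fade points to reduce overplotting
  geom_smooth(method = "lm", formula = y ~ x, colour = "red")  # linear fit with its confidence band

p_summary
```

Because x and y are independent draws here, the fitted line should come out close to flat; with real data the same overlay helps the eye find a trend that the cloud of points hides.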
"],["statistical-summaries-1.html", "4.6 Statistical Summaries", " 4.6 Statistical Summaries geom_histogram() and geom_bin2d() use a familiar geom, geom_bar() and geom_raster(), combined with a new statistical transformation, stat_bin() and stat_bin2d(). stat_bin() and stat_bin2d() combine the data into bins and count the number of observations in each bin. But what if we want a summary other than count? So far, we’ve just used the default statistical transformation associated with each geom. Now we’re going to explore how to use stat_summary_bin() to stat_summary_2d() to compute different summaries. ggplot(diamonds, aes(color)) + geom_bar() ggplot(diamonds, aes(color, price)) + geom_bar(stat = "summary_bin", fun = mean) ggplot(diamonds, aes(table, depth)) + geom_bin2d(binwidth = 1, na.rm = TRUE) + xlim(50, 70) + ylim(50, 70) ggplot(diamonds, aes(table, depth, z = price)) + geom_raster(binwidth = 1, stat = "summary_2d", fun = mean, na.rm = TRUE) + xlim(50, 70) + ylim(50, 70) ## Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted ## ℹ Consider using `geom_tile()` instead. ## Raster pixels are placed at uneven horizontal intervals and will be shifted ## ℹ Consider using `geom_tile()` instead. So far we’ve considered two classes of geoms: Simple geoms where there’s a one-on-one correspondence between rows in the data frame and physical elements of the geom Statistical geoms where introduce a layer of statistical summaries in between the raw data and the result Although ggplot2 does not have direct 3d support, it does provide the ability to plot 2d images representing 3d data. These include: contours, colored tiles, and bubble plots. ggplot(faithfuld, aes(eruptions, waiting)) + geom_contour(aes(z = density, colour = ..level..)) ## Warning: The dot-dot notation (`..level..`) was deprecated in ggplot2 3.4.0. ## ℹ Please use `after_stat(level)` instead. ## This warning is displayed once every 8 hours. 
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. ggplot(faithfuld, aes(eruptions, waiting)) + geom_raster(aes(fill = density)) # Bubble plots work better with fewer observations small <- faithfuld[seq(1, nrow(faithfuld), by = 10), ] ggplot(small, aes(eruptions, waiting)) + geom_point(aes(size = density), alpha = 1/3) + scale_size_area() "],["meeting-videos-4.html", "4.7 Meeting Videos", " 4.7 Meeting Videos 4.7.1 Cohort 1 Meeting chat log 00:32:41 Michael Haugen: geom_errorbar() otherwise known as tie fighter plot 00:33:12 Gustavo R. Brito: There's some good explanations about geom_smooth (and se too) in rdocumentation: https://rdocumentation.org/packages/ggplot2/versions/3.3.5/topics/geom_smooth 00:46:10 priyanka gagneja: the Grey area is the confidence interval 00:48:32 Federica Gazzelloni: The geom_smooth() function in ggplot2 can plot fitted lines from models with a simple structure. Supported model types include models fit with lm() , glm() , nls() , and mgcv::gam() . ... By default you will get confidence intervals plotted in geom_smooth() . 00:49:53 Federica Gazzelloni: This is a linear model fit, so I use method = "lm". 00:50:21 Stan Piotrowski: You can use “scale_y_continuous()” and some of the functions from the “scales” package to modify axes. 00:50:27 June Choe: the model gets fitted by StatSmooth$compute_group() here, if you're curious about the code! https://github.com/tidyverse/ggplot2/blob/759c63c2fd9e00ba3322c1b74b227f63c98d2e06/R/stat-smooth.r#L156-L173 00:51:31 Federica Gazzelloni: https://aosmith.rbind.io/2018/11/16/plot-fitted-lines/ 00:56:58 Federica Gazzelloni: some formula from the documentation: Formula to use in smoothing function, eg. y ~ x, y ~ poly(x, 2), y ~ log(x). NULL by default, in which case method = NULL implies formula = y ~ x when there are fewer than 1,000 observations and formula = y ~ s(x, bs = "cs") otherwise. 00:57:52 priyanka gagneja: thats ok , keep going. 
we can pick up the rest next time we meet. 00:58:21 Lydia Gibson: I’m going to run to my appointment. See you all next week! 00:58:41 Lydia Gibson: Sorry, in two weeks. 00:59:26 Ryan S: as a side note -- it seemed like the topic of stats (i.e., stat = "identity")…. this topic seemed to get very light treatment in the text. to me it seems like this idea of how stats work is a huge topic that requires a lot of understanding and practice. 00:59:52 Stan Piotrowski: I agree, Ryan. 01:00:18 Ryan S: suggest someone who understands this topic (and has the capacity to talk to it) may be willing to take 15 mins on it next time? 01:01:13 June Choe: I also agree (and would be happy to do this, just not this month!) I feel like we could use another week on stat before we're thrown into the Extending ggplot2 section - maybe around when we cover scales 01:02:25 Michael Haugen: Stat part 2 next week? 01:02:33 Stan Piotrowski: That sounds like a good idea to me! That’ll give us some time to dig into the code and figure out what’s going on 01:02:48 Ryan S: I think we're two weeks away (US holiday next week) 01:02:49 priyanka gagneja: +1 Michael "],["maps.html", "Chapter 5 Maps", " Chapter 5 Maps Learning Objectives: - Plot simple maps using geom_polygon() - Use simple features (sf) to plot GIS data with geom_sf() - Work with map projections and the underlying sf data structure - Draw maps using raster data Plotting geospatial data is a common visualization task. The process may require specialized tools. You can decompose the problem into two paths: - Using one data source to draw a map (if you have GIS data) - Adding metadata from another information source to the map (more common with relation to geographic areas) NOTE: X = Longitude, Y = Latitude. When spoken as “Lat/Lon”, the order is actually Y/X. Not confusing… just keeping with vocabulary and measurements! "],["polygon-maps.html", "5.1 Polygon Maps", " 5.1 Polygon Maps The simplest approach to mapping is using geom_polygon(). 
This forms boundaries around regions. library(ggplot2) library(dplyr) mi_counties <- map_data("county", "michigan") %>% select(lon = long, lat, group, id = subregion) head(mi_counties) ## lon lat group id ## 1 -83.88675 44.85686 1 alcona ## 2 -83.36536 44.86832 1 alcona ## 3 -83.36536 44.86832 1 alcona ## 4 -83.33098 44.83968 1 alcona ## 5 -83.30806 44.80530 1 alcona ## 6 -83.30233 44.77665 1 alcona In this data set we have four variables: - lat: Latitude of the vertex (as measured by horizontal paths) - lon: Longitude of the vertex (as measured by vertical paths) - id: name of the region - group: unique identifier for contiguous areas within a region ggplot(mi_counties, aes(lon, lat)) + geom_point(size = .25, show.legend = FALSE) + coord_quickmap() ggplot(mi_counties, aes(lon, lat, group = group)) + geom_polygon(fill = "white", colour = "grey50") + coord_quickmap() In this plot, coord_quickmap() is used to adjust the axes to ensure longitude and latitude are rendered on the same scale. For a more advanced use of ggplot2 for mapping, we’ll see the use of geom_sf() and coord_sf() to handle spatial data specified in simple features format. "],["simple-features-maps.html", "5.2 Simple Features Maps", " 5.2 Simple Features Maps You can use the approach above, but it is not practical for real-world work. Instead, most GIS data is encoded using the simple features standard produced by the Open Geospatial Consortium (https://www.ogc.org/) 5.2.1 Layered Maps 5.2.2 Labelled Maps 5.2.3 Adding Other Geoms "],["map-projections.html", "5.3 Map Projections", " 5.3 Map Projections "],["working-with-sf-data.html", "5.4 Working with sf Data", " 5.4 Working with sf Data "],["raster-maps.html", "5.5 Raster Maps", " 5.5 Raster Maps "],["data-sources.html", "5.6 Data Sources", " 5.6 Data Sources "],["meeting-videos-5.html", "5.7 Meeting Videos", " 5.7 Meeting Videos 5.7.1 Cohort 1 Meeting chat log 00:11:25 June Choe: hello! 
00:15:21 SriRam: Hi all, I am new here, I came to know about this from ISLR book club 00:16:14 Stan Piotrowski: Great have to have you here, SriRam! Some of us are also in the ISLR book club and I think this is a nice complement to that material 00:25:26 June Choe: I'd like to see the error! 00:26:54 June Choe: I think you'd have to add a geom_labe() layer 00:27:08 June Choe: but as Stan said it'll render text at every point 00:27:29 June Choe: after polygon would draw it on top 00:33:36 Michael Haugen: Reminds me of a Flight of the Concords episode 00:35:04 SriRam: 23.5 00:40:09 SriRam: It would be incorrect data to have multiple geometries on same record 00:47:44 Lydia Gibson: It’s spelled right… or at least that’s how it’s spelled in the book 00:48:16 June Choe: hm maybe the sf_label and sf_text layers also need to take the geometry aesthetic 00:49:02 June Choe: label.padding I think is from geom_label (the white space between text and bounding box) 00:52:51 Federica Gazzelloni: viridis 00:53:30 Federica Gazzelloni: scale_color_viridis() 00:53:42 Federica Gazzelloni: scale_fill_viridis() 00:54:01 SriRam: I think viridis is better for continuous values 00:54:57 Federica Gazzelloni: viridian package: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html 00:55:08 Federica Gazzelloni: viridis package: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html 00:55:51 Federica Gazzelloni: more: https://www.rdocumentation.org/packages/viridis/versions/0.5.1/topics/scale_color_viridis 00:56:21 Michael Haugen: David Robinson uses scale_fill_viridis_c() for a map in his most recent Tidy Tuesday Screen cast. See around 23minute mark: Tidy Tuesday live screencast: Analyzing registered nurses in R. https://www.youtube.com/watch?v=UVmxHb2Daeo&t=486s 01:12:20 Lydia Gibson: Thank you Ryan!! 01:12:34 Federica Gazzelloni: thanks Ryan 01:12:54 Stan Piotrowski: Thanks Ryan! 
Meeting chat log 00:09:20 priyanka gagneja: sorry everyone I just joined 00:09:38 Federica Gazzelloni: Hello! 00:10:02 priyanka gagneja: and will probably be a little in and out .. got a not so happy baby today at home 00:18:11 Stan Piotrowski: I need to take off for a conflict that just came up. Catch up with you all on slack! 00:20:46 SriRam: It is the image product ID 00:21:07 SriRam: All the IDE’s 00:21:50 Kent Johnson: The IDE codes are defined here: https://ropensci.github.io/bomrang/reference/get_available_imagery.html 00:27:32 SriRam: The process is called geo-referencing 00:27:58 SriRam: And image is called a geo-referenced image 00:31:33 SriRam: Yes, it is a reference system 00:31:38 SriRam: A coordinate reference 00:33:04 Federica Gazzelloni: this is the bit that makes the reference: crs = st_crs(sat_vis) 00:49:27 Jiwan Heo: something just came up, and have to leave. See you all next week! 00:59:58 priyanka gagneja: I am signing off now , can someone please address and sign off on my behalf. I will send a msg later on slack "],["networks.html", "Chapter 6 Networks", " Chapter 6 Networks Learning Objectives What is network data? New functions and geoms Visualization of nodes and edges as abstract concepts "],["introduction-2.html", "6.1 Introduction", " 6.1 Introduction This chapter illustrates how to build network data and how to make practical examples using some of the available packages: {tidygraph}, a tidy API for graph manipulation {ggraph} for network visualization {igraph} for generating random and regular graphs "],["what-is-network-data.html", "6.2 What is network data?", " 6.2 What is network data? Network data consists of entities (nodes or vertices) and the relations between them (edges or links). Edges can be directed or undirected. 6.2.1 A tidy network manipulation API The first package is {tidygraph}, a dplyr-style API for network data. New functions: activate() tells tidygraph which part of the network you want to work on, either nodes or edges.
.N() which gives access to the node data of the current graph even when working with the edges - .E() and .G() to access the edges or the whole graph) In this example we create a graph, assign a random label to the nodes, and sort the edges based on the label of their source node. The function play_erdos_renyi() creates graphs directly through sampling of different attributes. library(tidygraph) graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>% activate(nodes) %>% mutate(class = sample(letters[1:4], n(), replace = TRUE)) %>% activate(edges) %>% arrange(.N()$class[from]) graph ## # A tbl_graph: 10 nodes and 14 edges ## # ## # A directed simple graph with 3 components ## # ## # Edge Data: 14 × 2 (active) ## from to ## <int> <int> ## 1 1 10 ## 2 2 1 ## 3 2 3 ## 4 4 10 ## 5 10 4 ## 6 2 5 ## 7 2 8 ## 8 9 2 ## 9 9 4 ## 10 9 10 ## 11 3 2 ## 12 3 10 ## 13 5 8 ## 14 8 9 ## # ## # Node Data: 10 × 1 ## class ## <chr> ## 1 a ## 2 a ## 3 c ## # ℹ 7 more rows 6.2.2 Conversion Data can be converted with as_tbl_graph(), a data structure for tidy graph manipulation. 
It converts a data frame encoded as an edge list, and it can also convert the result of hclust(). data(highschool, package = "ggraph") head(highschool) ## from to year ## 1 1 14 1957 ## 2 1 15 1957 ## 3 1 21 1957 ## 4 1 54 1957 ## 5 1 55 1957 ## 6 2 21 1957 With as_tbl_graph() we obtain: hs_graph <- tidygraph::as_tbl_graph(highschool, directed = FALSE) hs_graph ## # A tbl_graph: 70 nodes and 506 edges ## # ## # An undirected multigraph with 1 component ## # ## # Node Data: 70 × 0 (active) ## # ## # Edge Data: 506 × 3 ## from to year ## <int> <int> <dbl> ## 1 1 13 1957 ## 2 1 14 1957 ## 3 1 20 1957 ## # ℹ 503 more rows 6.2.2.1 hclust() and dist() functions: In this example luv_colours is a dataset (not a function) that holds all built-in colors() translated into Luv colour space: a data frame with 657 observations and 4 variables. luv_colours <- as.data.frame(convertColor(t(col2rgb(colors())), "sRGB", "Luv")) luv_colours$col <- colors() head(luv_colours) ## L u v col ## 1 9341.570 -3.370649e-12 0.0000 white ## 2 9100.962 -4.749170e+02 -635.3502 aliceblue ## 3 8809.518 1.008865e+03 1668.0042 antiquewhite ## 4 8935.225 1.065698e+03 1674.5948 antiquewhite1 ## 5 8452.499 1.014911e+03 1609.5923 antiquewhite2 ## 6 7498.378 9.029892e+02 1401.7026 antiquewhite3 This visualization represents the content of the dataset; we will then see how it looks in a graph representation. ggplot(luv_colours, aes(u, v)) + geom_point(aes(colour = col), size = 3) + scale_color_identity() + coord_equal() + theme_void() For example, selecting the first 3 variables and plotting the data with the plot() function, we can see that there are some connections between the elements of the dataset, as the colors are connected to each other.
ggplot2::luv_colours[, 1:3] %>% head ## L u v ## 1 9341.570 -3.370649e-12 0.0000 ## 2 9100.962 -4.749170e+02 -635.3502 ## 3 8809.518 1.008865e+03 1668.0042 ## 4 8935.225 1.065698e+03 1674.5948 ## 5 8452.499 1.014911e+03 1609.5923 ## 6 7498.378 9.029892e+02 1401.7026 plot(ggplot2::luv_colours[, 1:3]) luv_clust <- hclust(dist(ggplot2::luv_colours[, 1:3])) class(luv_clust) ## [1] "hclust" With the tidygraph::as_tbl_graph() function we can transform the clustering result into an object of classes “tbl_graph” and “igraph”, ready to use for visualizing the network data. luv_graph <- as_tbl_graph(luv_clust) luv_graph;class(luv_graph) ## # A tbl_graph: 1313 nodes and 1312 edges ## # ## # A rooted tree ## # ## # Node Data: 1,313 × 4 (active) ## height leaf label members ## <dbl> <lgl> <chr> <int> ## 1 0 TRUE "101" 1 ## 2 0 TRUE "427" 1 ## 3 778. FALSE "" 2 ## 4 0 TRUE "571" 1 ## 5 0 TRUE "426" 1 ## 6 0 TRUE "424" 1 ## 7 0 TRUE "425" 1 ## 8 0 FALSE "" 2 ## 9 590. FALSE "" 3 ## 10 1652. FALSE "" 4 ## # ℹ 1,303 more rows ## # ## # Edge Data: 1,312 × 2 ## from to ## <int> <int> ## 1 3 1 ## 2 3 2 ## 3 8 6 ## # ℹ 1,309 more rows ## [1] "tbl_graph" "igraph" 6.2.3 Algorithms The real benefit of networks comes from the different operations that can be performed on them using the underlying structure.
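As a smaller, self-contained illustration of such an operation (a hedged sketch, not part of the book example), the code below computes degree centrality on a toy five-node star graph. create_star() and centrality_degree() come from {tidygraph}; the toy graph and the names star and deg are assumptions made for illustration.

```r
library(tidygraph)

# Toy graph (an assumption for illustration): node 1 is linked to nodes 2 to 5
star <- create_star(5) %>%
  activate(nodes) %>%
  mutate(degree = centrality_degree())

# Degree of each node, in node order: the hub has 4 links, each leaf has 1
deg <- star %>% as_tibble() %>% dplyr::pull(degree)
deg
```

The same mutate() pattern works with other centrality_*() helpers, which is what the larger luv_graph example does with centrality_pagerank().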
luv_graph %>% tidygraph::activate(nodes) %>% mutate(centrality = centrality_pagerank()) %>% arrange(desc(centrality)) ## # A tbl_graph: 1313 nodes and 1312 edges ## # ## # A rooted tree ## # ## # Node Data: 1,313 × 5 (active) ## height leaf label members centrality ## <dbl> <lgl> <chr> <int> <dbl> ## 1 0 TRUE 207 1 0.000763 ## 2 0 TRUE 315 1 0.000763 ## 3 0 TRUE 208 1 0.000763 ## 4 0 TRUE 316 1 0.000763 ## 5 0 TRUE 205 1 0.000763 ## 6 0 TRUE 313 1 0.000763 ## 7 0 TRUE 206 1 0.000763 ## 8 0 TRUE 314 1 0.000763 ## 9 0 TRUE 245 1 0.000763 ## 10 0 TRUE 353 1 0.000763 ## # ℹ 1,303 more rows ## # ## # Edge Data: 1,312 × 2 ## from to ## <int> <int> ## 1 1187 1079 ## 2 1187 1080 ## 3 942 797 ## # ℹ 1,309 more rows "],["visualizing-networks.html", "6.3 Visualizing networks", " 6.3 Visualizing networks To visualize network data we use {ggraph}. It builds on top of {tidygraph} and {ggplot2} to allow a complete and familiar grammar of graphics for network data. 6.3.1 Setting up the visualization Syntax of {ggraph}: ggraph() + geom_<functions>. By default ggraph() will choose an appropriate layout based on the type of graph you provide. Getting Started guide to layouts 6.3.1.1 Specifying a layout What is the basic requirement? A layout data frame needs at least an x and a y column and the same number of rows as there are nodes in the input graph. As an example we take the data(highschool, package = \"ggraph\") and make a visualization of the graph: hs_graph <- tidygraph::as_tbl_graph(highschool, directed = FALSE) library(ggraph) ggraph(hs_graph) + geom_edge_link() + geom_node_point() A second example uses more features: hs_graph <- hs_graph %>% tidygraph::activate(edges) %>% mutate(edge_weights = runif(n())) ggraph(hs_graph, layout = "stress", weights = edge_weights) + geom_edge_link(aes(alpha = edge_weights)) + geom_node_point() + scale_edge_alpha_identity() In the following examples we see different layouts.
Information about the “drl” layout (the DRL force-directed graph layout) can be found in the igraph package. layout <- ggraph::create_layout(hs_graph, layout = 'drl') ggraph(layout) + geom_edge_link() + geom_node_point() Here we build the graph with {igraph} instead of {tidygraph} and use layout = “kk” (Kamada-Kawai, layout.kamada.kawai): library(ggraph) library(igraph) hs_graph2 <- igraph::graph_from_data_frame(highschool) layout <- create_layout(hs_graph2, layout = "kk") ggraph(layout) + geom_edge_link(aes(colour = factor(year))) + geom_node_point() A very simple example of how to build a graph comes from this tutorial: Networks in igraph To understand a bit more about the graph structure we can use these functions: g1 <- igraph::graph( edges=c(1,2, 2,3, 3, 1), n=3, directed=FALSE ) E(g1); # access to the edges ## + 3/3 edges from 5c39927: ## [1] 1--2 2--3 1--3 V(g1); # the vertices ## + 3/3 vertices, from 5c39927: ## [1] 1 2 3 g1[] # access to the adjacency matrix ## 3 x 3 sparse Matrix of class "dgCMatrix" ## ## [1,] . 1 1 ## [2,] 1 . 1 ## [3,] 1 1 . 6.3.1.2 Circularity Layouts can be linear or circular.
coord_polar() changes the coordinate system of the whole plot, edges included, so the layout cannot decide how to respond to circularity; setting circular = TRUE in the layout is usually the better choice. Compare: ggraph(luv_graph, layout = 'dendrogram', circular = TRUE) + geom_edge_link() + coord_fixed() ggraph(luv_graph, layout = 'dendrogram') + geom_edge_link() + coord_polar() + scale_y_reverse() 6.3.2 Drawing nodes Nodes are usually drawn as points; more specialized geoms, such as tiles, are also available. geom_node_<functions>: geom_node_point() geom_node_tile() Getting Started guide to nodes ggraph(luv_graph, layout = "stress") + geom_edge_link() + geom_node_point(aes(colour = factor(members)), show.legend = FALSE) More features could be added to calculate node and edge centrality, such as: centrality_power() centrality_degree() ggraph(luv_graph, layout = "stress") + geom_edge_link() + geom_node_point(aes(colour = centrality_power())) Or making tiles: ggraph(luv_graph, layout = "treemap") + geom_node_tile(aes(fill = depth)) 6.3.3 Drawing edges geom_edge_link() draws a straight line between connected nodes; under the hood it splits the line into many small fragments.
geom_edge_link() geom_edge_link2() geom_edge_fan() geom_edge_parallel() geom_edge_elbow() geom_edge_bend() geom_edge_diagonal() Getting Started guide to edges Using after_stat(index): set.seed(123) ggraph(hs_graph, layout = "stress") + geom_edge_link(aes(alpha = after_stat(index))) Here is an example of how to use the node.class variable. The graph is the first one we saw, generated artificially with tidygraph::play_erdos_renyi(): graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>% activate(nodes) %>% mutate(class = sample(letters[1:4], n(), replace = TRUE)) %>% activate(edges) %>% arrange(.N()$class[from]) ggraph(graph, layout = "stress") + geom_edge_link2( aes(colour = node.class), width = 3, lineend = "round") ggraph(hs_graph, layout = "stress") + geom_edge_parallel() Trees and specifically dendrograms: ggraph(luv_graph, layout = "dendrogram", height = height) + geom_edge_elbow() 6.3.3.1 Clipping edges around the nodes Example: using arrows to show directionality of edges set.seed(1011) ggraph(graph, layout = "stress") + geom_edge_link( arrow = arrow(), start_cap = circle(5, "mm"), end_cap = circle(5, "mm") ) + geom_node_point(aes(colour = class), size = 8) 6.3.3.2 An edge is not always a line Nodes and edges are abstract concepts and can be visualized in a multitude of ways. geom_edge_point() ggraph(hs_graph, layout = "matrix", sort.by = node_rank_traveller()) + geom_edge_point() 6.3.4 Faceting facet_nodes() facet_edges() facet_graph() ggraph(hs_graph, layout = "stress") + geom_edge_link() + geom_node_point() + facet_edges(~year) "],["conclusions.html", "6.4 Conclusions", " 6.4 Conclusions Making a {ggraph} plot means understanding the different classes of data that can be passed to the function. It is also important to have a clear idea of the graph structure you would like to achieve for representing your data. There are many layouts available, and they differ by the class of provided data.
In addition, do not forget that you can draw network data with plain {ggplot2} as well. 6.4.1 Resources: tidygraph website Data Imaginist Imaginist layouts Network analysis with r R and igraph Getting Started guide to layouts Getting Started guide to nodes Getting Started guide to edges "],["meeting-videos-6.html", "6.5 Meeting Videos", " 6.5 Meeting Videos 6.5.1 Cohort 1 Meeting chat log 00:09:20 priyanka gagneja: sorry everyone I just joined 00:09:38 Federica Gazzelloni: Hello! 00:10:02 priyanka gagneja: and will probably be a little in and out .. got a not so happy baby today at home 00:18:11 Stan Piotrowski: I need to take off for a conflict that just came up. Catch up with you all on slack! 00:20:46 SriRam: It is the image product ID 00:21:07 SriRam: All the IDE’s 00:21:50 Kent Johnson: The IDE codes are defined here: https://ropensci.github.io/bomrang/reference/get_available_imagery.html 00:27:32 SriRam: The process is called geo-referencing 00:27:58 SriRam: And image is called a geo-referenced image 00:31:33 SriRam: Yes, it is a reference system 00:31:38 SriRam: A coordinate reference 00:33:04 Federica Gazzelloni: this is the bit that makes the reference: crs = st_crs(sat_vis) 00:49:27 Jiwan Heo: something just came up, and have to leave. See you all next week! 00:59:58 priyanka gagneja: I am signing off now , can someone please address and sign off on my behalf. I will send a msg later on slack Meeting chat log 00:15:36 Ryan S: https://www.youtube.com/playlist?list=PLkrJrLs7xfbWjD2rp3pIV85lby-tR3Cnu 00:15:48 Lydia Gibson: Thanks Ryan! 00:16:00 Ryan S: link to a very good basic tutorial on simple features 00:53:58 Lydia Gibson: What is GPU?
00:54:23 Ryan S: GPU is the graphics processing unit (I think) 00:54:31 Lydia Gibson: Thank you 00:54:32 Ryan S: it's the part that "draws" on your screen 00:54:43 Lydia Gibson: Oh okay 00:54:56 Ryan S: versus the CPU that does calculations 00:55:48 SriRam: For 2D and non texture plots, I think it is more a RAM issue 00:55:59 Ryan Metcalf: Oh. I’m so sorry for using Acronyms! Ryan S. is correct. The balance I’m asking Federica is related….”Can I use a slow Laptop or do I have to use a super computer with massive Video card to render these types of graphical objects. 00:56:23 Lydia Gibson: I always thought CPU was synonymous with computer. 00:57:22 Ryan S: Ryan, just repurpose the GPUs you currently have that are mining crypto 00:57:38 Ryan Metcalf: :) Agreed!!! 00:57:57 SriRam: Lol 00:59:14 SriRam: If you have a spatial network, do not miss out on “sfnetworks” package 01:00:07 Federica Gazzelloni: https://kateto.net/netscix2016.html 01:00:14 Federica Gazzelloni: https://www.data-imaginist.com/2017/ggraph-introduction-layouts/ 01:00:24 Federica Gazzelloni: https://www.hcbravo.org/networks-across-scales/misc/tidygraph.nb.html 01:00:41 Federica Gazzelloni: https://igraph.org/r/doc/layout_with_drl.html 01:00:48 Federica Gazzelloni: https://tidygraph.data-imaginist.com/reference/index.html#section-misc 01:00:57 Federica Gazzelloni: https://ggraph.data-imaginist.com/articles/Layouts.html 01:01:52 Federica Gazzelloni: https://web.stanford.edu/class/bios221/book/Chap-Graphs.html https://github.com/jtichon/ModernStatsModernBioJGT/tree/master/data https://simplemaps.com/data/world-cities 01:07:00 SriRam: Tidy is I think , Hadley definition, variable is a column, sample point is a row 01:07:36 SriRam: Sorry my microphone does not work since a few sessions now 🙁 01:07:50 Ryan S: borders on "marketing" to some degree. 
:) 01:08:08 SriRam: Sfnetwork is not for graphs, it is more for spatial operations "],["annotations.html", "Chapter 7 Annotations", " Chapter 7 Annotations Learning Objectives Plot and Axis Titles; providing context for the visual, and changing the look of plot elements and overall appearance Text Labels; mapping text from data or having text appear on graphs as data Building Custom Annotations; how to add summaries, context, arrows, and textual metadata to graphs Direct Labeling and Faceting; related packages for special issues such as highlighting, textboxes, html text "],["introduction-3.html", "7.1 Introduction", " 7.1 Introduction Packages - ggtext - ggthemes - gghighlight - palmerpenguins - ggrepel - grid Functions - geom_text - geom_label - theme(plot.title = element_text()) - geom = “curve” - geom_vline Resource - A ggplot Tutorial For Beautiful Plotting in R by Cedric Scherer August 5, 2019. Annotation Definitions “Conceptually, an annotation supplies metadata for the plot: that is, it provides additional information about the data being displayed. From a practical standpoint, however, metadata is just another form of data. Because of this, the annotation tools in ggplot2 reuse the same geoms that are used to create other plots.” Wickham, H., Navarro, N., & Lin Pedersen, T. (2016). Ggplot2: Elegant graphics for data analysis (Second ed.) Springer. “[Annotation] concerns judging the level of assistance an audience may require in order to understand the background, function and purpose of a project, as well as what guidance needs to be provided to help viewers perceive and interpret the data representations.” Kirk, Andy. Data Visualisation (p. 231). SAGE Publications. Kindle Edition.
"],["plot-and-axis-titles.html", "7.2 Plot and Axis Titles", " 7.2 Plot and Axis Titles base <- ggplot(penguins, aes(bill_length_mm, bill_depth_mm, color = species, shape = species)) + geom_point(alpha = .4) + geom_point(data = gd, size = 4) + theme_bw() + labs( title = "How does Bill Size Differ by species?", subtitle = "Source: Palmer Station Antarctica LTER and K. Gorman, 2020", x = "*Length*", y = "Width", caption = "ggplot 2 Book Club") + theme(plot.title = element_text(color = "midnightblue", hjust = .5, face = "bold")) + theme(plot.subtitle = element_text(hjust = .5, size = 9)) + theme(axis.title.x = ggtext::element_markdown()) line breaks quote() for mathamatical expressions. ?plotmath removing labels two ways: labs(x = ““) and labs(x = NULL) "],["text-labels.html", "7.3 Text labels", " 7.3 Text labels 8.2 Text labels - geom_text() - geom_text() adds label text to the x and y coorindates of a graph such as name instead of a circle in a scatter plot. Change the font with the family aesthetic The packages showtext and extrafont can help with handling fonts across differnet devises Change the fontface aesthetic for plain, bold, or italic “faces”. Alignment: hjust (“left”, “center”, “right”, “inward”, “outward”) and vjust (“bottom”, “middle”, “top”, “inward”, “outward”) aesthetics. 
vjust = “inward”, hjust = “inward” ensures labels stay in the plot geom_text(aes(label = text), vjust = “inward”, hjust = “inward”) df <- data.frame(x = 1, y = 3:1, face = c("plain", "bold", "italic")) ggplot(df, aes(x, y)) + geom_text(aes(label = face, fontface = face, ), vjust = "inward", hjust = "inward", size = 20, angle = 10) base + geom_text(aes(label = body_mass_g), check_overlap = TRUE) base + geom_label(aes(label = body_mass_g)) ggplot(mpg, aes(displ, hwy)) + geom_text(aes(label = model)) + xlim(1, 8) ggplot(mpg, aes(displ, hwy)) + geom_text(aes(label = model)) + xlim(1, 8) ggplot(mpg, aes(displ, hwy)) + geom_text(aes(label = model), check_overlap = TRUE) + xlim(1, 8) library(ggrepel) ggplot(mpg, aes(displ, hwy)) + geom_text_repel(aes(label = model)) + xlim(1, 8) label <- data.frame( waiting = c(55, 80), eruptions = c(2, 4.3), label = c("peak one", "peak two") ) ggplot(faithfuld, aes(waiting, eruptions)) + geom_tile(aes(fill = density)) + geom_label(data = label, aes(label = label)) geom_label "],["annotations-1.html", "7.4 Annotations", " 7.4 Annotations 8.3 Annotations - ggplot2 annotation options - geom_text and geom_label geom_rect() geom_line(), geom_path(), geom_segment(), arrow() geom_vline(), geom_hline(), geom_abline() annotate() which can be used in combination with arrow() base + annotate( geom = "text", x = 42, y = 20, label = "The Adelie species is on all 3 islands", size = 5, color = "darkcyan") Arrows Code base + annotate( geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53.1, y = 20, label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 32, y = 20.3, label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 53, y 
= 15, xend = 48, yend = 15, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53, y = 15.3, label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan") Arrows Plot base + annotate( geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53.1, y = 20, label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 32, y = 20.3, label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 53, y = 15, xend = 48, yend = 15, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53, y = 15.3, label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan") + theme(legend.position = "none") astronauts %>% filter(nationality %in% c("U.S.","Australia", "U.K.", "U.S.S.R/Russia", "Japan")) %>% ggplot(aes(x = nationality, y = hours_mission, color = hours_mission)) + coord_flip() + geom_point(size = 4, alpha = 0.15) + geom_boxplot(color = "gray60", outlier.alpha = 0) + stat_summary(fun = mean, geom = "point", size = 5, color = "dodgerblue") + annotate( geom = "curve", x = 3.8, y = 2500, xend = 4, yend = 650, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + annotate( "text", x = 3.7, y = 2500, label = "The U.S. 
Mean Hours Mission", size = 2.7) + annotate( geom = "curve", x = 4.7, y = 4200, xend = 5, yend = 2800, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + annotate( "text", x = 4.5, y = 3700, label = "The interquartile range, between 25% and 75% of values", size = 2.8) + annotate( geom = "curve", x = 1, y = 3800, xend = 1, yend = 900, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + annotate( "text", x = .8, y = 3000, label = "Australian Astronaut Andrew S. W. Thomas completed missions in 1983, 1998, 2001, 2005 and is now retired", size = 2.8) + scale_color_viridis_c() + scale_y_continuous(limits = c(0, 5000)) + labs(title = "Length of Astronaut Missions in hours", subtitle = "A Study was conducted on the effects of space on various individuals", caption = "Source: TidyTuesday 2020 week 29 \\n inspired by plots in The Evolution of a ggplot (ep1) by Cedric Scherer") + theme_fivethirtyeight() + theme(legend.position = "none") + theme(plot.title = element_text(hjust = .5)) + theme(plot.subtitle = element_text(hjust = .5)) "],["directlabels-package.html", "7.5 Directlabels Package", " 7.5 Directlabels Package Place labels closer to the data than legends. Related packages: {ggforce} {gghighlight} Base Code Nurse Salary library(ggthemes) library(scales) g <- nurses %>% group_by(year) %>% filter(state %in% c("Minnesota", "Wisconsin", "Iowa", "North Dakota", "Illinois", "Indiana", "Kansas", "Michigan", "Missouri", "Nebraska", "Ohio")) %>% ggplot(aes(year, annual_salary_median, color = state)) + geom_line() + labs( title = "Annual Median RN Salary by Midwestern State" ) + theme(legend.position = "none") + geom_vline(xintercept = c(2007, 2009), size = 1.5, color = "darkgoldenrod1", linetype = "dashed") + gghighlight::gghighlight(state %in% c("Minnesota", "Wisconsin", "Iowa")) + theme_economist() + scale_color_economist(name = NULL) + theme(axis.title = element_blank()) + scale_y_continuous(labels = comma_format()) gghighlight and facets base +
gghighlight::gghighlight() + facet_wrap(~ species) Examples with geom_richtext: library(ggtext) lab_html <- "★ geom_richtext can modify with html" g + geom_richtext(aes(x = 2010, y = 50000, label = lab_html), stat = "unique", angle = 30, color = "white", fill = "steelblue") geom_textbox lab_long <- "**The Great Recession** <br><b style='font-size:10pt;color:steelblue;'> Minnesota's RN Annual Salaries increased during the Great Recession and then completely flattened out before rising again after 2015</b>" g + geom_textbox(aes(x = 2015, y = 40000, label = lab_long), width = unit(15, "lines"), stat = "unique") "],["faceting-annotations.html", "7.6 Faceting Annotations", " 7.6 Faceting Annotations g + facet_wrap(~state, scales = "free_x") The grid package scales coordinates between 0 and 1. library(grid) my_grob <- grobTree(textGrob("Great Recession", x = .2, y = .9, hjust = 0, gp = gpar(col = "black", fontsize = 10, fontface = "bold"))) g + annotation_custom(my_grob) + facet_wrap(~state, scales = "free_x") "],["resources-1.html", "7.7 Resources", " 7.7 Resources ggplot 2 book chapter 8 annotations A ggplot Tutorial For Beautiful Plotting in R by Cedric Scherer August 5, 2019 The Evolution of a ggplot (EP.1) by Cedric Scherer Introduction to gghighlight by Hiroaki Yutani 2021-06-05 "],["meeting-videos-7.html", "7.8 Meeting Videos", " 7.8 Meeting Videos 7.8.1 Cohort 1 Meeting chat log 00:10:14 Ed: Hi everyone. My connection is shaky so if I drop off don’t take it personally. 😇 00:10:32 Michael Haugen: Thanks for joining us! 00:10:42 Ryan Metcalf: Great to see you. No worries at all. 00:24:25 Ryan Metcalf: To support Michael’s quote, I mentioned a Swedish Statician…Hans Rosling. The Gapminder project was his brain child. Great Ted Talks were delivered by the user: https://www.ted.com/speakers/hans_rosling 00:32:42 June Choe: re: text/font rendering - {ragg} + {systemfonts} is now recommended over {showtext}/{extrafont}!
00:32:59 June Choe: https://yjunechoe.github.io/posts/2021-06-24-setting-up-and-debugging-custom-fonts/ 00:33:39 Federica Gazzelloni: @June thanks 00:34:28 June Choe: here's some quotes from Thomas Lin Pedersen (ggplot2 dev) on showtext/extrafont - https://twitter.com/thomasp85/status/1355083725156077571 https://twitter.com/thomasp85/status/1261539815960518656 00:39:31 Ed: So is it necessary to hard code the locations for those arrows? It won't stop them where it makes sense to go? 00:39:46 Ed: What about different resolution screens, etc. 00:41:36 Kent Johnson: Yes, you have to hard-code the arrow start and end. 00:42:09 Ed: 👍 00:42:46 Kent Johnson: My experience is, it's pretty fiddly to get something really nice. I don't know how plot size / screen resolution affect the arrows. 00:43:42 Ryan Metcalf: https://fivethirtyeight.com/ 00:46:42 June Choe: linewidth and arrow size would be subject to resolution but not the stard/end points 00:47:07 June Choe: start/end points are converted to native coordinate units but size is absolute 00:47:46 Ed: 👍 00:48:03 June Choe: (which is why you should never rely just on plot panel output and always use something like ggsave!) 00:48:58 Ed: Awesome tip. Could see myself getting frustrated but good to know going into it. 00:49:35 June Choe: since like an update or two ago, ggsave() started returning the path to the saved image invisibly, so if you 00:50:07 June Choe: if you're on windows, you can do something like `system2("open", ggsave("img.png"))` and itll open up the plot after saving it 00:50:27 June Choe: (open it back up using your system's default photo viewing app) 00:58:21 Ryan Metcalf: Sheesh! This took me forever to find! I mentioned Arrows outside of a graphic. I was using it with D3 objects (similar to ggplot2). 
https://github.com/krispo/yarrow 01:01:04 June Choe: big fan - and you should check out {sinab} as well for a more powerful version of ggtext by the same dev (though this one's heavily experimental and requires Rust) - https://clauswilke.com/sinab/ 01:01:18 Michael Haugen: thanks 01:03:14 June Choe: the 0-1 coord scale in grid here is called "npc" (Normalized Parent Coordinates) 01:04:21 Ryan Metcalf: June, you are a wealth of knowledge! 🙂I may ping you outside of Zoom (Slack) for further discussions on Graphical Objects. 01:05:00 Ryan S: Awesome job Michael! 01:05:12 June Choe: For sure @Ryan ! Always happy to talk about data viz 01:05:15 June Choe: and thanks for presenting Michael! 01:05:50 June Choe: xaringanExtra i think 01:06:22 June Choe: https://pkg.garrickadenbuie.com/xaringanExtra/#/extra-styles 01:07:31 Federica Gazzelloni: Thanks Michael "],["arranging-plots.html", "Chapter 8 Arranging Plots", " Chapter 8 Arranging Plots Learning Objectives Produce several subplots as part of the same main visualization Use a range of packages that provide different approaches to arranging separate plots "],["introduction-4.html", "8.1 Introduction", " 8.1 Introduction This chapter focuses on making more than one plot in one visualization, using the following packages: patchwork cowplot gridExtra ggpubr "],["arranging-plots-side-by-side-with-no-overlap.html", "8.2 Arranging plots side by side with no overlap", " 8.2 Arranging plots side by side with no overlap 8.2.1 Taking control of the layout More compositions: 8.2.2 More about layouts This way it is possible to customize the theme for one plot or for both. 8.2.3 Plot annotations "],["arranging-plots-on-top-of-each-other.html", "8.3 Arranging plots on top of each other", " 8.3 Arranging plots on top of each other It is possible to nest plots inside each other and to set the position of the inset inside the main plot.
General options are the left, right, top, and bottom locations, but more specific locations can be set using grid::unit() (the default is npc units, which go from 0 to 1). In addition, the location is by default relative to the panel area, but the align_to argument can make it relative to the plot area instead. An inset can be placed exactly 15 mm from the top right corner. "],["extra.html", "8.4 Extra", " 8.4 Extra grid and gridExtra packages cowplot package To add a common title we use ggdraw() ggpubr package "],["conclusions-1.html", "8.5 Conclusions", " 8.5 Conclusions Patchwork (Data Imaginist) is the package covered in the book; some other packages provide similar results with different approaches. 8.5.1 Extra resources: grid and gridExtra cowplot ggpubr "],["meeting-videos-8.html", "8.6 Meeting Videos", " 8.6 Meeting Videos 8.6.1 Cohort 1 Meeting chat log 00:27:47 Lydia Gibson: What are npc unts? 00:27:56 Michael Haugen: "npc" (Normalized Parent Coordinates) 00:28:02 Michael Haugen: 0 to 1 00:28:07 Lydia Gibson: Oh okay. Thank you! 00:28:16 Michael Haugen: Same thing that was used for faceting annotations. 00:28:43 Michael Haugen: so .8 is l80 percent of the way up the y axis for example. 00:28:47 Lydia Gibson: I missed annotations last week. I’ll have to go back and watch the session. 00:43:21 SriRam: I use patch and cowplot 00:50:00 Kent Johnson: https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ 00:50:09 Lydia Gibson: Thank you! 00:50:17 Michael Haugen: The arrows are in chapter 8.3 with geom and curve for example, annotate( geom = "curve", x = 4, y = 35, xend = 2.65, yend = 27, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + 00:51:16 Michael Haugen: and arrows came up in the discussion as a discussion of the GROB and arrows and how to render your plot so the arrows are not distorted. 00:51:16 Ryan Metcalf: Perfect! “arrow” was the argument I was after!
00:51:37 Michael Haugen: And then we talked about ggsave as a part of that 00:53:32 Michael Haugen: we all will be at Cedric’s level by the end of this bookclub right? 00:53:47 SriRam: :D 00:53:52 Lydia Gibson: Hopefully lol 00:54:20 Ryan S: Thank you! 00:54:29 SriRam: Thank you Meeting chat log 00:04:54 June Choe: https://yjunechoe.github.io/ggtrace-talk/ 00:12:16 Ryan S: brilliant... didn't know this before but really simplifies the concept 00:14:08 Michael Haugen: Makes sense 00:26:15 June Choe: https://ggplot2.tidyverse.org/reference/aes_eval.html 00:26:36 Ryan Metcalf: I’m thinking in the context….I buy a car. The engineers have optimized it for longevity….but I want a hot rod….So I need to open the hood and change parts. Or, access the computer and start changing parameters. 00:32:54 SriRam: This is like scuba diving, more beautiful under the surface :) 00:33:15 Stan Piotrowski: Great analogy, SriRam! 00:33:17 Ryan Metcalf: Completely agree @SriRam! 00:36:44 June Choe: ggplot2:::ggplot_build.ggplot 00:37:58 June Choe: ggplot2:::print.ggplot 00:50:22 Federica Gazzelloni: thanks June!!! 00:50:54 SriRam: Out of curiosity, how much of this trickery (internal functions) can be learnt from "advanced R" or are these mentioned in the ggplot book ? I am just a regular user, I may not go this deep, but looks very interesting to explore/read during the Christmas break 00:51:12 Stan Piotrowski: I’m in the same boat as SriRam 00:51:32 Stan Piotrowski: Curious to know more about this but can definitely see myself getting lost in a rabbit hole 00:54:23 Ryan S: at some point -- maybe a different session -- can we dive deep into the different stat options ("identity", "count", etc.) 00:54:47 Ryan S: specifically, what do they do and when would you use them 00:55:09 Ryan Metcalf: June, this is amazing! 00:57:27 SriRam: Countdown starts...... 5 mins to come back to reality !!! :D 01:02:21 Stan Piotrowski: Great talk, June! 01:02:34 Kent Johnson: Thank you! See you next week! 
"],["position-scales-and-axes.html", "Chapter 9 Position scales and axes", " Chapter 9 Position scales and axes Learning Objectives What are the defining components of a scale? When/why does the data need to be transformed for a visualization? What are the defining components of an axis? What is the relationship between scale and axis? "],["introduction-preliminaries-asides.html", "9.1 Introduction / preliminaries / asides", " 9.1 Introduction / preliminaries / asides This chapter introduces position scales and axes. It may also be helpful to understand position scales and axes as position scales and guides, because axes share the same API as guides for non-positional scales like color legends. The parallel will be clearer in the next chapter. It’s worthwhile to read the documentation of the {scales} package to learn more about scales, since that handles a lot of the (re-)scaling and transformation under the hood. It may be good to start with the rstudio::conf2020 talk on scales. It should also be noted that there’s some discussion about revamping the scales_* API. See issue #4269 and PR #4271. Lastly, a small aside on the book’s after_stat() example in the intro, continuing nicely from our discussion on ggplot internals last week. ## [1] "StatBin" ## Aesthetic mapping: ## * `x` -> `after_stat(count)` ## * `y` -> `after_stat(count)` ## * `weight` -> 1 ## [1] "x|y" ## Aesthetic mapping: ## * `x` -> `displ` "],["numeric.html", "9.2 10.1 Numeric", " 9.2 10.1 Numeric 9.2.1 10.1.1 Limits The book doesn’t have content for this section (??) But we know that you can set limits with xlim()/ylim() or scale_x|y_*(limits = ) 9.2.2 10.1.2 Out of bounds values NOTE: A big theme of the {scales} package as of v1.1.1 (May 2020) is that they have very transparent function names. For example, the family of functions for Out Of Bounds (oob) handling are all named oob_*(). This is an intentional (re-)design of the package to work nicely with autocomplete. 
## [1] "oob_censor" "oob_censor_any" "oob_discard" ## [4] "oob_keep" "oob_squish" "oob_squish_any" ## [7] "oob_squish_infinite" By default, data outside scales are set to NA. This is because the oob argument is set to oob_censor()/censor(). Note that oob only applies to continuous scales, since values of a discrete scale form a fixed set. ## { ## call <- caller_call() ## if (scale_override_call(call)) { ## call <- current_call() ## } ## sc <- continuous_scale(ggplot_global$x_aes, palette = identity, ## name = name, breaks = breaks, n.breaks = n.breaks, minor_breaks = minor_breaks, ## labels = labels, limits = limits, expand = expand, oob = oob, ## na.value = na.value, transform = transform, trans = trans, ## guide = guide, position = position, call = call, super = ScaleContinuousPosition) ## set_sec_axis(sec.axis, sc) ## } ## censor Book’s examples: Equivalent solutions with oob_*() You can use oob functions for non-positional scales 9.2.3 10.1.3 Visual range expansion Book examples: With expansion() from v3.3.0 (Dec 2020) ## $mult ## [1] 0 ## ## $add ## [1] 0 9.2.4 10.1.4 Exercises 9.2.5 10.1.5 Breaks ## [1] "breaks_extended" "breaks_hms" "breaks_log" "breaks_pretty" ## [5] "breaks_timespan" "breaks_width" Book example: ## const up txt big log ## 1 1 1 a 1000 2 ## 2 1 2 b 2000 5 ## 3 1 3 c 3000 10 ## 4 1 4 d 4000 2000 Demo from {scales}: ## scale_x_continuous(breaks = scales::breaks_extended()) ## scale_x_continuous(breaks = scales::breaks_extended(n = 2)) ## scale_x_continuous(NULL) At the vector level: ## [1] 1000 2000 3000 4000 ## [1] 1000 4000 Other breaks: ## [1] 0 25 50 75 100 ## [1] 0 10 20 30 40 50 60 70 80 90 100 110 ## [1] 0 20 40 60 80 100 120 ## [1] 1 10 100 1000 Debugging arguments in scale_*() that take function factories 9.2.6 10.1.6 Minor breaks Book example: ## [1] 1 2 3 4 5 6 7 8 9 10 20 30 ## [13] 40 50 60 70 80 90 100 200 300 400 500 600 ## [25] 700 800 900 1000 2000 3000 4000 5000 6000 7000 8000 9000 ## [37] 10000 There are also minor break 
functions: ## [1] "minor_breaks_n" "minor_breaks_width" 9.2.7 10.1.7 Labels ## [1] "label_bytes" "label_comma" "label_currency" ## [4] "label_date" "label_date_short" "label_dollar" ## [7] "label_log" "label_math" "label_number" ## [10] "label_number_auto" "label_number_si" "label_ordinal" ## [13] "label_parse" "label_percent" "label_pvalue" ## [16] "label_scientific" "label_time" "label_timespan" ## [19] "label_wrap" Book examples: 9.2.8 10.1.8 Exercises 9.2.9 10.1.9 Transformations Book example: The transformation is carried out by a “transformer”, which describes the transformation, its inverse, and how to draw the labels. You can construct your own transformer using scales::trans_new() Case study: make reversed log x-axis ## Transformer: log-10 [1e-100, Inf] ## Transformer: reverse [-Inf, Inf] ## $name ## ## ## $transform ## ## ## $inverse ## ## ## $d_transform ## NULL ## ## $d_inverse ## NULL ## ## $breaks ## extended_breaks() ## ## $minor_breaks ## regular_minor_breaks() ## ## $format ## format_format() ## ## $domain ## c(-Inf, Inf) Regardless of which method you use, the transformation occurs before any statistical summaries. To transform after statistical computation use coord_trans() From the docs: Example where stat transformation matters: ## x ymin ymax ymin_final ymax_final ## 1 1 12 28 12 28 ## 2 2 22 33 17 44 ## 3 3 15 26 15 26 ## x ymin ymax ymin_final ymax_final ## 1 1 1.079181 1.447158 1.079181 1.447158 ## 2 2 1.361728 1.531479 1.230449 1.643453 ## 3 3 1.176091 1.414973 1.176091 1.414973 ## x ymin ymax ymin_final ymax_final ## 1 1 12 28 12 28 ## 2 2 22 33 17 44 ## 3 3 15 26 15 26 9.2.10 ASIDE - A little more on transformations transform() method of the Scales ggproto: transform() Transforms a vector of values using self$trans. This occurs before the Stat is calculated. 
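A minimal sketch of what a transformer does at the vector level, assuming the {scales} package is installed. It uses reverse_trans() as the example transformer, and the vector x is made up for illustration; this is only meant to show the transform/inverse pair described above, not how ggplot2 calls it internally.

```r
library(scales)

# A transformer bundles a transform, its inverse, and break/label helpers
rev_tr <- reverse_trans()

# Hypothetical data values
x <- c(1000, 2000, 3000, 4000)

# What the scale applies to the data before the stat sees it
x_transformed <- rev_tr$transform(x)
print(x_transformed)

# The inverse maps back to the original data space, e.g. for labelling
x_back <- rev_tr$inverse(x_transformed)
print(x_back)
```

Because the stat only ever sees the transformed values, any summary it computes is a summary of the transformed data.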
Transformation changes the layer data ## const up txt big log ## 1 1 1 a 1000 2 ## 2 1 2 b 2000 5 ## 3 1 3 c 3000 10 ## 4 1 4 d 4000 2000 ## x y PANEL group shape colour size fill alpha stroke ## 1 -1000 1 1 1 19 black 1.5 NA NA 0.5 ## 2 -2000 2 1 2 19 black 1.5 NA NA 0.5 ## 3 -3000 3 1 3 19 black 1.5 NA NA 0.5 ## 4 -4000 4 1 4 19 black 1.5 NA NA 0.5 ## function () ## { ## new_transform("reverse", function(x) -x, function(x) -x, ## d_transform = function(x) rep(-1, length(x)), d_inverse = function(x) rep(-1, ## length(x)), minor_breaks = regular_minor_breaks(reverse = TRUE)) ## } ## <bytecode: 0x5567abae6290> ## <environment: namespace:scales> ## List of 9 ## $ name : chr "reverse" ## $ transform :function (x) ## $ inverse :function (x) ## $ d_transform :function (x) ## $ d_inverse :function (x) ## $ breaks :function (x, n = n_default) ## $ minor_breaks:function (b, limits, n) ## $ format :function (x) ## $ domain : num [1:2] -Inf Inf ## - attr(*, "class")= chr "transform" ## [1] -1000 -2000 -3000 -4000 ## [1] 1000 2000 3000 4000 ## [1] "1000" "2000" "3000" "4000" Most useful for positioning purposes (ex: time_trans()) ## [1] 953553600 953557200 953560800 953564400 953568000 953571600 953575200 ## [8] 953578800 953582400 953586000 ## [1] "2000-03-20 12:00:00 UTC" "2000-03-20 13:00:00 UTC" ## [3] "2000-03-20 14:00:00 UTC" "2000-03-20 15:00:00 UTC" ## [5] "2000-03-20 16:00:00 UTC" "2000-03-20 17:00:00 UTC" ## [7] "2000-03-20 18:00:00 UTC" "2000-03-20 19:00:00 UTC" ## [9] "2000-03-20 20:00:00 UTC" "2000-03-20 21:00:00 UTC" ## [1] "12:00" "15:00" "18:00" "21:00" ## x y PANEL group shape colour size fill alpha stroke ## 1 953553600 0 1 -1 19 black 1.5 NA NA 0.5 ## 2 953557200 0 1 -1 19 black 1.5 NA NA 0.5 ## 3 953560800 0 1 -1 19 black 1.5 NA NA 0.5 ## 4 953564400 0 1 -1 19 black 1.5 NA NA 0.5 ## 5 953568000 0 1 -1 19 black 1.5 NA NA 0.5 ## 6 953571600 0 1 -1 19 black 1.5 NA NA 0.5 ## 7 953575200 0 1 -1 19 black 1.5 NA NA 0.5 ## 8 953578800 0 1 -1 19 black 1.5 NA NA 0.5 
## 9 953582400 0 1 -1 19 black 1.5 NA NA 0.5 ## 10 953586000 0 1 -1 19 black 1.5 NA NA 0.5 "],["date-time.html", "9.3 10.2 Date-time", " 9.3 10.2 Date-time 9.3.1 10.2.1 Breaks Book example: Making it explicit: Book example: ## [1] "1900-01-01" "1925-01-01" "1950-01-01" "1975-01-01" "2000-01-01" Using offset argument (unit = days): ## [1] "1900-02-01" "1925-02-01" "1950-02-01" "1975-02-01" "2000-02-01" Calculating the offset: ## Time difference of 31 days 9.3.2 10.2.2 Minor breaks Book examples: In the second plot, the major and minor breaks follow slightly different patterns: the minor breaks are always spaced 7 days apart but the major breaks are 1 month apart. Because the months vary in length, this leads to slightly uneven spacing. Explicit: 9.3.3 10.2.3 Labels Book examples: "],["discrete.html", "9.4 10.3 Discrete", " 9.4 10.3 Discrete Book examples: 9.4.1 10.3.1 Limits For discrete scales, limits should be a character vector that enumerates all possible values. Censors missing categories in the set: Adds new categories without value: Same effect with drop = FALSE with unused factor levels It drops unused factor levels by default, though 9.4.2 10.3.2 Scale labels 9.4.3 10.3.2 Scale labels Book example: Debugging strategy 9.4.4 10.3.3 guide_axis() Book examples: More guides in {ggh4x} - https://teunbrand.github.io/ggh4x/ "],["binned.html", "9.5 10.4 Binned", " 9.5 10.4 Binned Book example: "],["aside---geom_sf-limits.html", "9.6 ASIDE - geom_sf() + limits", " 9.6 ASIDE - geom_sf() + limits 9.6.1 Example from Twitter: https://twitter.com/Josh_Ebner/status/1470818469801299970?s=20 9.6.2 Reprexes from Ryan S: ## # A tibble: 6 × 2 ## x_coord y_coord ## <dbl> <dbl> ## 1 1 1 ## 2 1 2 ## 3 2 1 ## 4 3 2 ## 5 6 5 ## 6 1 1 Full range polygon Polygon with limits Path with limits geom_sf() without limits geom_sf() with limits 9.6.3 Further exploration Using geom_sf() adds CoordSf by default ## [1] "CoordSf" "CoordCartesian" "Coord" "ggproto" ## [5] "gg" ## [1] "CoordSf" 
"CoordCartesian" "Coord" "ggproto" ## [5] "gg" In fact, geom_sf() must be used with coord_sf() ## Error in `geom_sf()`: ## ! Problem while converting geom to grob. ## ℹ Error occurred in the 1st layer. ## Caused by error in `draw_panel()`: ## ! `geom_sf()` can only be used with `coord_sf()`. The underlying geometry is untouched (indicating that limits are not removing data) ## geometry PANEL group xmin xmax ymin ymax linetype alpha ## 1 POLYGON ((1 1, 1 2, 2 1, 3 ... 1 -1 1 6 1 5 1 NA ## stroke ## 1 0.5 ## geometry PANEL group xmin xmax ymin ymax linetype alpha ## 1 POLYGON ((1 1, 1 2, 2 1, 3 ... 1 -1 1 NA 1 5 1 NA ## stroke ## 1 0.5 ## [1] TRUE OOB handling inside scale_x|y_continuous() cannot override the behavior Instead, coord_sf(lims_method = ) offers other spatial-specific methods. Censor doesn’t seem to be one but an option like \"geometry_bbox\" automatically sets limits to the smallest bounding box that contain all geometries. Interesting note from the docs: … specifying limits via position scales or xlim()/ylim() is strongly discouraged, as it can result in data points being dropped from the plot even though they would be visible in the final plot region. 
9.6.4 Internals Scale censor for geom_polygon() Scale censor for geom_sf() Inspecting the rendered geom with layer_grob() ## # A tibble: 6 × 2 ## x y ## <simplUnt> <simplUnt> ## 1 0.04545455native 0.04545455native ## 2 0.04545455native 0.2727273native ## 3 0.2272727native 0.04545455native ## 4 0.4090909native 0.2727273native ## 5 0.9545455native 0.9545455native ## 6 0.04545455native 0.04545455native ## # A tibble: 6 × 2 ## x y ## <simplUnt> <simplUnt> ## 1 0.04545455native 0.04545455native ## 2 0.04545455native 0.2727273native ## 3 0.3484848native 0.04545455native ## 4 0.6515152native 0.2727273native ## 5 1.560606native 0.9545455native ## 6 0.04545455native 0.04545455native "],["meeting-videos-9.html", "9.7 Meeting Videos", " 9.7 Meeting Videos 9.7.1 Cohort 1 Meeting chat log 00:59:06 June Choe: There's also a nice animation from wikipedia (the cylinder is squished because of perceptual inequality between hues) - https://upload.wikimedia.org/wikipedia/commons/transcoded/8/8d/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm.480p.vp9.webm "],["colour-scales-and-legends.html", "Chapter 10 Colour Scales and Legends", " Chapter 10 Colour Scales and Legends Learning Objectives Learn how to map values to colours in ggplot2 Learn about colour theory (a more detailed exposition is available online at http://tinyurl.com/clrdtls) "],["a-little-colour-theory.html", "10.1 A little colour theory", " 10.1 A little colour theory There have been many attempts to come up with colours spaces that are more perceptually uniform. We’ll use a modern attempt called the HCL colour space, which has three components of hue, chroma and luminance: -Hue ranges from 0 to 360 (an angle) and gives the “colour” of the colour (blue, red, orange, etc). -Chroma is the “purity” of a colour, ranging from 0 (grey) to a maximum that varies with luminance. -Luminance is the lightness of the colour, ranging from 0 (black) to 1 (white). 
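A small sketch of the three HCL components using base R's hcl() function, as an illustration only. Note one assumption that differs from the description above: grDevices::hcl() expects luminance on a 0 to 100 scale rather than 0 to 1. The particular hue, chroma and luminance values are arbitrary.

```r
# Three colours with the same chroma and luminance, differing only in hue
hues <- c(0, 120, 240)
cols <- hcl(h = hues, c = 50, l = 60)
print(cols)

# With zero chroma the hue no longer matters: every colour is the same grey
greys <- hcl(h = hues, c = 0, l = 60)
print(greys)

# Changing only luminance moves a grey towards black or white
ramp <- hcl(h = 0, c = 0, l = c(10, 50, 90))
print(ramp)
```

Holding chroma and luminance fixed while varying hue is the same idea the default discrete colour scale relies on.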
An additional complication is that many people (~10% of men) do not possess the normal complement of colour receptors and so can distinguish fewer colours than usual. In brief, it’s best to avoid red-green contrasts, and to check your plots with systems that simulate colour blindness. Vischeck (https://www.vischeck.com/vischeck/) is one online solution. Another alternative is the dichromat package, which provides tools for simulating colour blindness, and a set of colour schemes known to work well for colour-blind people. You can also help people with colour blindness in the same way that you can help people with black-and-white printers: by providing redundant mappings to other aesthetics like size, line type or shape. 10.1.1 Colour blindness "],["continuous-colour-scales.html", "10.2 Continuous colour scales", " 10.2 Continuous colour scales Colour gradients are often used to show the height of a 2d surface. The plots in this section use the surface of a 2d density estimate of the faithful dataset, which records the waiting time between eruptions and the duration of each eruption for the Old Faithful geyser in Yellowstone Park. Any time I refer to scale_fill_*() in this section there is a corresponding scale_colour_*() for the colour aesthetic (or scale_color_*() if you prefer US spelling). 10.2.1 Particular palettes There are multiple ways to specify continuous colour scales. You can construct your own palette, but it is often unnecessary because there are many “hand picked” palettes available. ggplot2 supplies two scale functions that bundle pre-specified palettes, scale_fill_viridis_c() and scale_fill_distiller(). The viridis scales are designed to be perceptually uniform in both colour and when reduced to black and white, and to be perceptible to people with various forms of colour blindness. 
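As a hedged sketch of the two bundled palette families mentioned above, the code below draws the 2d density surface from ggplot2's built-in faithfuld data and swaps the fill scale. The palette choices ('magma', 'YlGnBu') are arbitrary picks for illustration, not recommendations from the book.

```r
library(ggplot2)

# 2d density estimate of the Old Faithful data, drawn as a raster
base <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster()

# Default viridis palette
p_viridis <- base + scale_fill_viridis_c()

# Another palette from the viridis family
p_magma <- base + scale_fill_viridis_c(option = 'magma')

# A ColorBrewer palette interpolated for continuous data
p_distiller <- base + scale_fill_distiller(palette = 'YlGnBu')
```

Printing any of the three objects renders the plot; replacing fill with colour in both the aesthetic and the scale name gives the corresponding colour scale.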
The second group of continuous colour scales built in to ggplot2 are derived from the ColorBrewer scales: scale_fill_brewer() provides these colours as discrete palettes, while scale_fill_distiller() and scale_fill_fermenter() are the continuous and binned analogs. scale_fill_scico() provides palettes that are perceptually uniform and suitable for scientific visualisation A particularly useful package is paletteer which aims to provide a common interface. 10.2.2 Robust recipes The default scale for continuous fill scales is scale_fill_continuous() which in turn defaults to scale_fill_gradient(). As a consequence, these three commands produce the same plot using a gradient scale. Gradient scales provide a robust method for creating any colour scheme you like. You just specify two or more reference colours, and ggplot2 will interpolate linearly between them. Three functions that you can use for this purpose are *scale_fill_gradient() produces a two-colour gradient *scale_fill_gradient2() produces a three-colour gradient with specified midpoint *scale_fill_gradientn() produces an n-colour gradient The Munsell colour system provides an easy way of specifying colours based on their hue, chroma and luminance. The munsell package provides easy access to the Munsell colours, which can then be used to specify a gradient scale. For more information on the munsell package see https://github.com/cwickham/munsell/. Three-point gradient scales typically convey the perceptual impression that there is a natural midpoint (often a zero value) from which the other values diverge. The left plot below shows how to create a divergent “yellow/blue” scale. If you have colours that are meaningful for your data (e.g., black body colours or standard terrain colours), or you’d like to use a palette produced by another package, you may wish to use an n-point gradient. The middle and right plots below use the colorspace package. 
For more information on the colorspace package see https://colorspace.r-forge.r-project.org/. 10.2.3 Missing values All continuous colour scales have an na.value parameter that controls what colour is used for missing values (including values outside the range of the scale limits). By default it is set to grey, which will stand out when you use a colourful scale. If you use a black and white scale, you might want to set it to something else to make it more obvious. You can set na.value = NA to make missing values invisible, or choose a specific colour if you prefer: 10.2.4 Limits, breaks and labels You can suppress the breaks entirely by setting them to NULL. For axes, this removes the tick marks, grid lines, and labels; and for legends this removes the keys and labels. 10.2.5 Legends "],["discrete-colour-scales.html", "10.3 Discrete colour scales", " 10.3 Discrete colour scales Discrete colour and fill scales occur in many situations. A typical example is a barchart that encodes both position and fill to the same variable. The default scale for discrete colours is scale_fill_discrete() which in turn defaults to scale_fill_hue() so these are identical plots: 10.3.1 Brewer scales scale_colour_brewer() is a discrete colour scale that—along with the continuous analog scale_colour_distiller() and binned analog scale_colour_fermenter()—uses handpicked “ColorBrewer” colours taken from http://colorbrewer2.org/. These colours have been designed to work well in a wide variety of situations, although the focus is on maps and so the colours tend to work better when displayed in large areas. There are many different options: The first group of palettes are sequential scales that are useful when your discrete scale is ordered (e.g., rank data), and are available for continuous data using scale_colour_distiller(). For unordered categorical data, the palettes of most interest are those in the second group. 
‘Set1’ and ‘Dark2’ are particularly good for points, and ‘Set2’, ‘Pastel1’, ‘Pastel2’ and ‘Accent’ work well for areas. Note that no palette is uniformly good for all purposes. Scatter plots typically use small plot markers, and bright colours tend to work better than subtle ones: Bar plots usually contain large patches of colour, and bright colours can be overwhelming. Subtle colours tend to work better in this situation: 10.3.2 Hue and grey scales The default colour scheme picks evenly spaced hues around the HCL colour wheel. This works well for up to about eight colours, but after that it becomes hard to tell the different colours apart. You can control the default chroma and luminance, and the range of hues, with the h, c and l arguments: One disadvantage of the default colour scheme is that because the colours all have the same luminance and chroma, when you print them in black and white, they all appear as an identical shade of grey. Noting this, if you are intending a discrete colour scale to be printed in black and white, it is better to use scale_fill_grey() which maps discrete data to greys, from light to dark: 10.3.3 Paletteer Scales 10.3.4 Manual scales If none of the hand-picked palettes is suitable, or if you have your own preferred colours, you can use scale_fill_manual() to set the colours manually. This can be useful if you wish to choose colours that highlight a secondary grouping structure or draw attention to different comparisons: You can also use a named vector to specify colors to be assigned to each level which allows you to specify the levels in any order you like: 10.3.5 Limits, breaks and labels 10.3.6 Legends "],["binned-colour-scales.html", "10.4 Binned colour scales", " 10.4 Binned colour scales Color scales also come in binned versions. The default scale is scale_fill_binned() which in turn defaults to scale_fill_steps(). These scales have an n.breaks argument that controls the number of discrete colour categories created by the scale. 
Counterintuitively—because the human visual system is very good at detecting edges—this can sometimes make a continuous colour gradient easier to perceive: In other respects scale_fill_steps() is analogous to scale_fill_gradient(), and allows you to construct your own two-colour gradients. There is also a three-colour variant scale_fill_steps2() and n-colour scale variant scale_fill_stepsn() that behave similarly to their continuous counterparts: A brewer analog for binned scales also exists, and is called scale_fill_fermenter(): Note that like the discrete scale_fill_brewer()—and unlike the continuous scale_fill_distiller()—the binned function scale_fill_fermenter() does not interpolate between the brewer colours, and if you set n.breaks larger than the number of colours in the palette a warning message will appear and some colours will not be displayed. 10.4.1 Limits, breaks and labels 10.4.2 Legends "],["date-time-colour-scales.html", "10.5 Date Time Colour Scales", " 10.5 Date Time Colour Scales When a colour aesthetic is mapped to a date/time type, ggplot2 uses scale_colour_date() or scale_colour_datetime() to specify the scale. These are designed to handle date data, analogous to the date scales discussed in Section 10.2. These scales have date_breaks and date_labels arguments that make it a little easier to work with these data, as the slightly contrived example below illustrates: "],["alpha-scales.html", "10.6 Alpha scales", " 10.6 Alpha scales Alpha scales map the transparency of a shade to a value in the data and can be a convenient way to visually down-weight less important observations. scale_alpha() is an alias for scale_alpha_continuous() since that is the most common use of alpha, and it saves a bit of typing. "],["legend-position.html", "10.7 Legend position", " 10.7 Legend position A number of settings that affect the overall display of the legends are controlled through the theme system. 
You’ll learn more about that in Section 18.2, but for now, all you need to know is that you modify theme settings with the theme() function. The position and justification of legends are controlled by the theme setting legend.position, which takes values “right”, “left”, “top”, “bottom”, or “none” (no legend). Switching between left/right and top/bottom modifies how the keys in each legend are laid out (horizontal or vertically), and how multiple legends are stacked (horizontal or vertically). If needed, you can adjust those options independently: legend.direction: layout of items in legends (“horizontal” or “vertical”). legend.box: arrangement of multiple legends (“horizontal” or “vertical”). legend.box.just: justification of each legend within the overall bounding box, when there are multiple legends (“top”, “bottom”, “left”, or “right”). Alternatively, if there’s a lot of blank space in your plot you might want to place the legend inside the plot by setting legend.position to a numeric vector of length two. The numbers represent a relative location in the panel area: c(0, 1) is the top-left corner and c(1, 0) is the bottom-right corner. You control which corner of the legend the legend.position refers to with legend.justification, which is specified in a similar way. Unfortunately positioning the legend exactly where you want it requires a lot of trial and error. 
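A minimal sketch of the theme settings described above, using the mpg data that ships with ggplot2. It only covers the named positions and the direction and box options; placing the legend inside the panel with a numeric vector is left out because, as far as I know, the way to request it differs between ggplot2 versions.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy, colour = factor(cyl), shape = drv)) +
  geom_point()

# Theme settings: legends below the plot, keys in a row, legends side by side
legend_theme <- theme(
  legend.position = 'bottom',
  legend.direction = 'horizontal',
  legend.box = 'horizontal'
)

p_bottom <- base + legend_theme

# Suppress all legends
p_none <- base + theme(legend.position = 'none')
```

Keeping the theme settings in their own object makes it easier to try several positions while searching for one that works.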
"],["meeting-videos-10.html", "10.8 Meeting Videos", " 10.8 Meeting Videos 10.8.1 Cohort 1 Meeting chat log 00:59:06 June Choe: There's also a nice animation from wikipedia (the cylinder is squished because of perceptual inequality between hues) - https://upload.wikimedia.org/wikipedia/commons/transcoded/8/8d/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm.480p.vp9.webm Meeting chat log 00:12:21 June Choe: BTW as of April 2021 v0.6.0 {viridis} got 3 more color palettes -- mako, rocket, and turbo --- https://sjmgarnier.github.io/viridis/articles/intro-to-viridis.html 00:20:41 June Choe: "for legends this removes the keys and labels" i guess? 00:22:34 June Choe: scale_fill_hue in turn uses scales::hue_pal(), if you want to use the default discrete color palette - https://scales.r-lib.org/reference/hue_pal.html 00:30:43 Federica Gazzelloni: really like this one: https://colorspace.r-forge.r-project.org/ 00:35:02 Michael Haugen: https://github.com/rfordatascience/tidytuesday 00:35:56 Michael Haugen: When I have accessed data from TT I have usually read them in manually. 00:36:04 Michael Haugen: for example: starbucks <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-12-21/starbucks.csv') 00:46:06 Ryan Metcalf: PodCast Page: https://www.tidytuesday.com/ 00:50:16 Michael Haugen: Ryan Almost took down TidyTuesday 00:50:46 Michael Haugen: Make sure to commit to main 00:50:52 Ryan S: lol "],["other-aesthetics.html", "Chapter 11 Other Aesthetics", " Chapter 11 Other Aesthetics Learning objectives: To learn about several other aesthetics that ggplot2 can use to represent data, including: size scales shape scales line type scales manual scales identity scales "],["size.html", "11.1 Size", " 11.1 Size The size aesthetic is typically used to scale points and text. 
The default scale for size aesthetics is scale_size() in which a linear increase in the variable is mapped onto a linear increase in the area (not the radius) of the geom. There are several size scales: scale_size_area() and scale_size_binned_area() are versions of scale_size() and scale_size_binned() that ensure that a value of 0 maps to an area of 0. scale_radius() maps the data value to the radius rather than to the area (Section 12.1.1). scale_size_binned() is a size scale that behaves like scale_size() but maps continuous values onto discrete size categories, analogous to the binned position and colour scales discussed in Sections 10.4 and 11.4 respectively. Legends associated with this scale are discussed in Section 12.1.2. scale_size_date() and scale_size_datetime() are designed to handle date data, analogous to the date scales discussed in Section 10.2. 11.1.1 Radius size scales There are situations where area scaling is undesirable, and for such situations scale_radius() may be more appropriate. For example, consider a data set containing astronomical data that includes the radius of different planets: ## name type position radius orbit ## 1 Mercury Inner 1 2440 57900000 ## 2 Venus Inner 2 6052 108200000 ## 3 Earth Inner 3 6378 149600000 ## 4 Mars Inner 4 3390 227900000 ## 5 Jupiter Outer 5 71400 778300000 ## 6 Saturn Outer 6 60330 1427000000 ## 7 Uranus Outer 7 25559 2871000000 ## 8 Neptune Outer 8 24764 4497100000 11.1.2 Binned size scales Binned size scales work similarly to binned scales for colour and position aesthetics (Sections 11.4 and 10.4) with the exception of how legends are displayed. The default legend for a binned size scale, and all binned scales except position and colour aesthetics, is governed by guide_bins(). 
For instance, in the mpg data we could use scale_size_binned() to create a binned version of the continuous variable hwy: Unlike guide_legend(), the guide created for a binned scale by guide_bins() does not organize the individual keys into a table. Instead they are arranged in a column (or row) along a single vertical (or horizontal) axis, which by default is displayed with its own axis. The important arguments to guide_bins() are listed below: axis indicates whether the axis should be drawn (default is TRUE) direction is a character string specifying the direction of the guide, either “vertical” (the default) or “horizontal” show.limits specifies whether tick marks are shown at the ends of the guide axis (default is FALSE) axis.colour, axis.linewidth and axis.arrow are used to control the guide axis that is displayed alongside the legend keys keywidth, keyheight, reverse and override.aes have the same behavior for guide_bins() as they do for guide_legend() (see Section 11.3.6) "],["shape.html", "11.2 Shape", " 11.2 Shape Values can be mapped to the shape aesthetic, most typically when you have a small number of discrete categories. Note: if the data variable contains more than 6 values, it becomes difficult to distinguish between shapes, and ggplot2 will produce a warning. Although any one plot is unlikely to be readable with more than 6 distinct markers, there are 25 possible shapes to choose from. The default scale_shape() function takes a single argument: set solid = TRUE (the default) to use a “palette” consisting of three solid shapes and three hollow shapes, or set solid = FALSE to use six hollow shapes: You can specify the marker types for each data value manually using scale_shape_manual(). For more information about manual scales see Section 12.4. 
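As an illustration of scale_shape_manual(), here is a short sketch on the mpg data. The shape codes assigned to each drv level are a hypothetical choice made for this example; the names of the vector are matched against the levels of the mapped variable.

```r
library(ggplot2)

# Hypothetical assignment: names are levels of drv, values are shape codes
drv_shapes <- c('4' = 16, 'f' = 17, 'r' = 15)

p_shape <- ggplot(mpg, aes(displ, hwy, shape = drv)) +
  geom_point() +
  scale_shape_manual(values = drv_shapes)

# Inspect the shapes actually assigned to the points
shape_data <- layer_data(p_shape)
print(unique(shape_data$shape))
```

Using a named vector means the mapping does not depend on the order of the factor levels.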
"],["line-type.html", "11.3 Line type", " 11.3 Line type It is possible to map a variable onto the linetype aesthetic, which works best for discrete variables with a small number of categories, where scale_linetype() is an alias for scale_linetype_discrete(). Continuous variables cannot be mapped to line types unless scale_linetype_binned() is used: although there is a scale_linetype_continuous() function, all it does is produce an error. With five categories the above plot is quite difficult to read. The default “palette” for linetype is supplied by the scales::linetype_pal() function, and includes the 13 linetypes shown below: You can control the line type by specifying a string with up to 8 hexadecimal values. In this specification, -the first value is the length of the first line segment, the second value is the length of the first space between segments, and so on. This allows you to specify your own line types using scale_linetype_manual(), or alternatively, by passing a custom function to the palette argument. Note that the last four lines are blank, because the linetypes() function defined above returns NA when the number of categories exceeds 9. The scale_linetype() function contains a na.value argument used to specify what kind of line is plotted for these values. By default this produces a blank line, but you can override this by setting na.value = “dotted”: Valid line types can be set using a human readable character string: “blank”, “solid”, “dashed”, “dotted”, “dotdash”, “longdash”, and “twodash” are all understood. "],["manual-scales-1.html", "11.4 Manual scales", " 11.4 Manual scales Manual scales are just a list of valid values that are mapped to the unique discrete values. If you want to customize these scales, you need to create your own new scale with the “manual” version of each: scale_linetype_manual(), scale_shape_manual(), scale_colour_manual(), etc. 
The manual scale has one important argument, values, where you specify the values that the scale should produce. If this vector is named, the scale matches the output values to the input values by name; otherwise it matches them in order of the levels of the discrete variable. You will need some knowledge of the valid aesthetic values, which are described in vignette(“ggplot2-specs”). Manual scales have appeared earlier, in Sections 11.3.4 and 12.2. In the following example, you’ll see a creative use of scale_colour_manual() to display multiple variables on the same plot and show a useful legend. In most plotting systems, you’d color the lines and then add a legend: That doesn’t work in ggplot2 because there’s no way to add a legend manually. Instead, give the lines informative labels: And then tell the scale how to map labels to colours: "],["identity-scales.html", "11.5 Identity Scales", " 11.5 Identity Scales Identity scales, such as scale_colour_identity() and scale_shape_identity(), are used when your data is already scaled such that the data and aesthetic spaces are the same. The code below shows an example where the identity scale is useful. luv_colours contains the locations of all R’s built-in colours in the LUV colour space (the space that HCL is based on). ## L u v col ## 1 9341.570 -3.370649e-12 0.0000 white ## 2 9100.962 -4.749170e+02 -635.3502 aliceblue ## 3 8809.518 1.008865e+03 1668.0042 antiquewhite ## 4 8935.225 1.065698e+03 1674.5948 antiquewhite1 ## 5 8452.499 1.014911e+03 1609.5923 antiquewhite2 ## 6 7498.378 9.029892e+02 1401.7026 antiquewhite3 "],["meeting-videos-11.html", "11.6 Meeting Videos", " 11.6 Meeting Videos 11.6.1 Cohort 1 Meeting chat log 00:22:22 Federica Gazzelloni: that’s very useful 00:23:08 Michael Haugen: Arrows! 00:31:57 Ryan Metcalf: https://ggplot2-book.org/scale-other.html#scale-manual 00:39:42 Federica Gazzelloni: where do you put the question mark? 
00:39:49 Ryan Metcalf: It may only be me…I always forget how to pull installed datasets in R. If you run `data()` it will list all installed datasets. 00:39:55 Federica Gazzelloni: before the function' 00:40:03 Federica Gazzelloni: ?.. 00:40:17 Federica Gazzelloni: to have help information 00:40:28 Ryan Metcalf: I put it on the front: `?LakeHuron` 00:42:21 June Choe: BTW a tangent but something I just learned recently about the help syntax: `?` will exact match and `??` will regex match. So `?LakeHuron` and `??keHuro` also works (with the latter being a bit slower) 00:43:00 Ryan S: @June -- Whoa, that's cool 00:43:43 Ryan S: If anyone cares, here is the code that does the LakeHuron data WITH an automatic legend 00:43:45 Ryan S: data.frame(year = 1875:1972, level = as.numeric(LakeHuron)) %>% mutate(above = level + 5, below = level -5) %>% pivot_longer(cols = c("above", "below"), values_to = "new_level", names_to = "level_set") %>% ggplot(aes(x = year, y = new_level, groups = level_set, color = level_set)) + geom_line() 00:44:09 Federica Gazzelloni: cool 00:44:12 Ryan Metcalf: Awesome comment June! That would explain why I get “unexpected” behavior….I wasn’t sure of the differences. Thanks for clarifying! 00:44:22 June Choe: A more on-topic regex-y example of ?? would be like `??scale_.*_manual` 00:44:25 Ryan S: don't forget the groups = level_set…. 00:45:05 Federica Gazzelloni: 0.2 near the minimu 00:45:21 Federica Gazzelloni: scale alpha is 0 to 1 00:46:15 Federica Gazzelloni: thanks! 00:46:34 Ryan S: thanks, Lydia! 00:46:36 Federica Gazzelloni: they are all very useful features 00:47:56 June Choe: maybe we can do a week of tidytuesday session if many people of us are interested too! 00:48:06 Michael Haugen: ^^ 00:48:13 priyanka gagneja: sure 00:48:30 Michael Haugen: I like that; devote one week on a Tidy Tuesday and not a chapter. 
00:48:32 Federica Gazzelloni: would love that @june 00:49:54 June Choe: I have an old (static) tidytuesday submission done in D3 if you want to peak at the code - https://observablehq.com/@yjunechoe/tidytuesday-2021-22 00:50:16 June Choe: (but agree with everything Ryan M's saying about how complex it is + pretty big learning curve) 00:51:49 June Choe: there's base svg renderer and also {svglite} which is developed by Rstudio https://svglite.r-lib.org/ 00:55:03 Michael Haugen: D3PO 00:55:12 June Choe: I have an example of r2d3 rendered in Rmarkdown with D3 code edited in RStudio - https://gist.github.com/yjunechoe/074e0020841fec3009b239583f305adc 00:55:40 June Choe: (Rstudio has javascript syntax highlight support so writing D3 wasn't too weird) 00:56:47 Michael Haugen: Does Shiny replace the need for D3 or is that apples and organges? 00:57:41 June Choe: IMO shiny is bulkier because it requires an R server backend but D3/JS can entirely be server-side (all calculations happen inside the user's browser) 00:57:52 June Choe: oops client-side* 00:57:58 Michael Haugen: thanks June. Makes sense. 00:58:19 Lydia Gibson: Off topic: I believe they will be removing the examples from the Ggplot2 book in the third edition. 00:59:08 Federica Gazzelloni: of course you can use it for scraping tables 00:59:21 Ryan S: June -- if you design your application for client-side calculations, I assume you have to optimize it so that it doesn't clog up the user's computer? 00:59:40 Ryan Metcalf: R2D3 Package Link: https://rstudio.github.io/r2d3/ 00:59:52 Ryan S: example -- you don't want to throw a million records at the client-side just because your server side can handle it? 00:59:53 June Choe: @Ryan S yaaa and i don't have much experience in that but thats a big topic 01:00:02 June Choe: Thank you! 01:00:18 Ryan S: I'll try it using Ryan M's client side. :-) 01:01:05 Federica Gazzelloni: that would be great! 
01:01:10 Ryan Metcalf: Slack channel for Data Visualization Society: Datavizsociety.slack.com 01:01:16 Lydia Gibson: Yes please! 01:01:35 Ryan Metcalf: Finally, Pandoc link: https://pandoc.org/ 01:01:55 June Choe: didn't know about that slack - cool! "],["build-a-plot-layer-by-layer.html", "Chapter 12 Build a plot layer by layer", " Chapter 12 Build a plot layer by layer Learning objectives: Understanding ggplot layers How to control layers Application to real data "],["building-a-plot.html", "12.1 Building a plot", " 12.1 Building a plot In this chapter we talk about the grammar of graphics plots and their construction layer by layer. We use data from the {SpatialEpi} package: Let’s check what data is inside the package. We can use the NYleukemia dataset, which contains observations of leukemia cases in NY along with population counts and spatial information such as the latitude and longitude where the cases were located. ## censustract.FIPS cases population ## 1 36007000100 3.08284 3540 ## 2 36007000200 4.08331 3560 ## 3 36007000300 1.08750 3739 ## censustract.FIPS x y ## 1 36007000100 -75.94087 42.10782 ## 2 36007000200 -75.93118 42.11099 ## 3 36007000300 -75.92011 42.11738 Let’s now make a first layer visualization using ggplot2. The second layer of our plot takes the geoms into consideration. In general, when we make a ggplot we build the plot without thinking about the layers, but what happens under the hood when we add a layer? The layer() function is called to combine data, stat and geom. 
Layers are created using geom_* or stat_* calls or directly using the function: layer( geom = NULL, stat = NULL, data = NULL, mapping = NULL, position = NULL, params = list(), inherit.aes = TRUE, check.aes = TRUE, check.param = TRUE, show.legend = NA, key_glyph = NULL, layer_class = Layer ) To obtain the same results: layer() function components: mapping data geom stat position … "],["data.html", "12.2 Data", " 12.2 Data The layers of your plot can be populated with different datasets. Here we generate two new datasets from the df dataset. What geom_smooth() does behind the scenes? fit a model, in this case a loess model generate prediction, about the trend of the data In this example we create a grid of length of 50 to have an average trend to show in a secondary layer of the plot. ## # A tibble: 3 × 2 ## population cases ## <dbl> <dbl> ## 1 9 0.194 ## 2 274. 0.273 ## 3 540. 0.369 ## [1] 50 2 ## [1] 281 5 Next step would be to isolate the outliers (observations far away from predicted values), with the help of the resid() function to extract model residuals ## Call: ## loess(formula = cases ~ population, data = df) ## ## Number of Observations: 281 ## Equivalent Number of Parameters: 5.33 ## Residual Standard Error: 1.769 ## Trace of smoother matrix: 5.84 (exact) ## ## Control settings: ## span : 0.75 ## degree : 2 ## family : gaussian ## surface : interpolate cell = 0.2 ## normalize: TRUE ## parametric: FALSE ## drop.square: FALSE And build the residuals std error vector: ## censustract.FIPS cases population x y ## 1 36007012500 7.13834 5911 -75.69563 42.06164 ## 2 36007013000 7.11907 5088 -76.00001 42.12407 ## 3 36007013302 0.19008 8122 -76.06948 42.13547 ## [1] 16 5 Add a new layer with different data: grid 12.2.1 Exercises Recreate the plot in the book "],["aesthetic-mappings.html", "12.3 Aesthetic mappings", " 12.3 Aesthetic mappings The aesthetics: aes() allows for some omissions, under certain conditions. 
The complete syntax would be: ggplot( data = ..., mapping = aes(x = ..., y = ..., ...)) In general, x = and y = inside aes(x = ..., y = ..., ...) can be omitted. Sometimes R complains about a missing mapping; this typically happens when more than one layer with different datasets is used. To solve the issue, it is enough to spell out all the specifications inside the aesthetics. One more interesting thing to mention: what happens when complex transformations are set inside aes()? As an example, take the log transformation (the example is from the diamonds dataset): aes(log(carat), log(price)) Behind the scenes, ggplot2 evaluates these expressions against the layer data, much as an explicit call to dplyr::mutate() would. (Referring to variables with $ inside aes() should be avoided.) 12.3.1 Specifying the aesthetics in the plot vs. in the layers All of these alternatives are allowed: ggplot(mpg, aes(displ, hwy, colour = class)) + geom_point() ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) ggplot(mpg, aes(displ)) + geom_point(aes(y = hwy, colour = class)) ggplot(mpg) + geom_point(aes(displ, hwy, colour = class)) But under some conditions, such as when geom_smooth() is used, secondary aesthetics need to be specified in the layer rather than in the plot, because their position affects the result. In the first case the smooth line doesn’t show up. 12.3.2 Setting vs. mapping What is the difference between mapping and setting an aesthetic? You can put the colour argument (or other secondary arguments) inside or outside aes(), with different results. To map an aesthetic to a variable, put it inside aes(): geom_...(aes(colour = cut)) To set an aesthetic to a constant, such as a specific colour value, put it outside aes(): geom_...(colour = "red") An alternative would be to use the function: scale_colour_identity() When more than one geom_smooth() is used in the plot, the different colours can be specified with a scale_colour_...() function. 
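The difference between mapping and setting can be sketched as follows. This is only an illustration under stated assumptions: the object names, the smoothing methods, and the colours "steelblue" and "firebrick" are arbitrary choices, not taken from the book.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy))

# Mapping: colour is inside aes(), so it varies with `class` and gets a legend
p_mapped <- base + geom_point(aes(colour = class))

# Setting: colour is outside aes(), so every point is drawn in red, no legend
p_set <- base + geom_point(colour = "red")

# Two smooths: the strings inside aes() are labels, not colours;
# scale_colour_manual() then maps each label to an actual colour
p_smooths <- base +
  geom_point() +
  geom_smooth(aes(colour = "loess"), method = "loess", formula = y ~ x, se = FALSE) +
  geom_smooth(aes(colour = "lm"), method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_manual(
    name = "Method",
    values = c(loess = "steelblue", lm = "firebrick")
  )
```

Labelling each smooth inside aes() is what lets ggplot2 build the legend automatically, since legends are only drawn for mapped aesthetics.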
"],["geoms-1.html", "12.4 Geoms", " 12.4 Geoms geoms stands for geometric objects for short. Some geoms requires both x and y while others not, as well as other require more than simply x and y, such as xmax, ymax etc. If you do geom_ and tab all the available geoms appear in a list for you to choose from. As an example here we use the geom_quantile() to represent a smoothed quantile regression and the geom_rug() for maginal rugs. 12.4.1 Exercises Discussion The book suggests to download the cheatsheets: ggplot2 cheatsheet (Ex.5) Display how a variable has changed over time: source ## # A tibble: 3 × 6 ## date pce pop psavert uempmed unemploy ## <date> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 1967-07-01 507. 198712 12.6 4.5 2944 ## 2 1967-08-01 510. 198911 12.6 4.7 2945 ## 3 1967-09-01 516. 199113 11.9 4.6 2958 Show the detailed distribution of a single variable The distribution can be described using a frequency table and histogram. Focus attention on the overall trend in a large dataset Interesting resource Draw a map Label outlying points "],["stats.html", "12.5 Stats", " 12.5 Stats There are several stat_...() functions used to transform the data by summarizing information. For example the stat_ecdf() compute the empirical cumulative distribution plot Here we use stat_summary() function for *categorical data** 12.5.1 Generated variables from the stat_...() functions stat takes a data frame as input and returns a data frame as output. Here we use the diamonds dataset, to see hoe the after_stat() can be applied 12.5.2 Exercises What stats were used to create the Q-Q plot? What stats were used to create the Normal density? 
"],["position-adjustments.html", "12.6 Position adjustments", " 12.6 Position adjustments The position is very important for some geoms: position_nudge() position_jitter() position_jitterdodge() all of them can be used inside the geom: geom_count() "],["meeting-videos-12.html", "12.7 Meeting Videos", " 12.7 Meeting Videos 12.7.1 Cohort 1 Meeting chat log 00:08:06 June Choe: hey all! 00:08:13 Federica Gazzelloni: Hi!! 00:08:18 June Choe: thanks for moving the time to this hour 00:08:37 Federica Gazzelloni: That’s better for me either 00:08:43 Lydia Gibson: https://imstat.org/meetings-calendar/ims-international-conference-on-statistics-and-data-science-icsds/ 00:08:51 June Choe: (now i get to call in as I eat lunch at the student common space) 00:57:46 Kent Johnson: Thank you, this was an interesting chapter! 00:57:52 Michael Haugen: Thank you! 00:57:57 June Choe: thanks! 00:58:12 Ryan Metcalf: Thank you Federica! 00:58:23 Stan Piotrowski: Thanks for a great presentation! Meeting chat log 00:14:06 June Choe: gray is default iirc 00:23:25 June Choe: I wonder if something like this works with datetime values on x scale_x_date(date_breaks = "2 weeks", offset = 31) 00:23:54 June Choe: (or offset = -31, maybe) 00:25:30 June Choe: I see - I'll play around with it more ! 00:36:05 Federica Gazzelloni: rle {base}: Compute the lengths and values of runs of equal values in a vector – or the reverse operation. 00:36:27 Ryan Metcalf: Sorry team, I have to drop. Great job Kent! 00:48:15 Federica Gazzelloni: related with cumulative values 00:48:29 Priyanka Gagneja: Thanks Ryan. See you next time 00:50:34 June Choe: It's discussed in Advanced R book Ch. 10.2.4! https://adv-r.hadley.nz/function-factories.html?q=stateful#stateful-funs 00:50:43 Federica Gazzelloni: thanks! 00:52:04 Priyanka Gagneja: @June this in response the environment() ? 
01:05:03 June Choe: I have the 2nd edition of R Graphics book from 2011 that has a chapter on ggplot2 back then and the code has not changed (i'll see if I can upload a page from that) 01:06:06 June Choe: they also changed some syntax from tidyr in the new update from like a few days ago 01:06:16 June Choe: (to make it easier for users especailly with respect to nest!) 01:07:39 June Choe: thanks! "],["scales-and-guides.html", "Chapter 13 Scales and Guides", " Chapter 13 Scales and Guides Learning objectives: Illustrate that there is nothing preventing you from transforming other kinds of scales beyond continuous position scale Show how concepts for position scales apply elsewhere Discuss the theory underpinning scales and guides "],["theory-of-scales-and-guides.html", "13.1 Theory of scales and guides", " 13.1 Theory of scales and guides Each scale is a function from a region in data space to a region in aesthetic space. The axis or legend is the inverse function, known as the guide: it allows you to convert visual properties back to data. Surprisingly, axes and legends are the same type of thing, but while they look very different they have the same purpose: to allow you to read observations from the plot and map them back to their original values. The commonalities between the two are illustrated below: Argument name Axis Legend name Label Title breaks Ticks & grid line Key labels Tick label Key label However, legends are more complicated than axes, and consequently there are a number of topics that are specific to legends: 1. A legend can display multiple aesthetics (e.g. colour and shape), from multiple layers (Section 15.7.1), and the symbol displayed in a legend varies based on the geom used in the layer (Section 15.8) 2. Axes always appear in the same place. Legends can appear in different places, so you need some global way of positioning them. (Section 11.7) 3. 
Legends have more details that can be tweaked: should they be displayed vertically or horizontally? How many columns? How big should the keys be? This is discussed in (Section 15.5) 13.1.1 Scale specification An important property of ggplot2 is the principle that every aesthetic in your plot is associated with exactly one scale. For instance, when you write this ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) ggplot2 adds a default scale for each aesthetic used in the plot: ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) + scale_x_continuous() + scale_y_continuous() + scale_colour_discrete() ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) + scale_x_continuous(name = "A really awesome x axis label") + scale_y_continuous(name = "An amazingly great y axis label") The use of + to “add” scales to a plot is a little misleading because if you supply two scales for the same aesthetic, the last scale takes precedence: ggplot(mpg, aes(displ, hwy)) + geom_point() + scale_x_continuous(name = "Label 1") + scale_x_continuous(name = "Label 2") #> Scale for 'x' is already present. Adding another scale for 'x', which will #> replace the existing scale. ggplot(mpg, aes(displ, hwy)) + geom_point() + scale_x_continuous(name = "Label 2") ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) + scale_x_sqrt() + scale_colour_brewer() 13.1.2 Naming scheme The scale functions intended for users all follow a common naming scheme. You’ve probably already figured out the scheme, but to be concrete, it’s made up of three pieces separated by “_“: 1. scale 2. The name of the primary aesthetic (e.g., colour, shape or x) 3. The name of the scale (e.g., continuous, discrete, brewer). 13.1.3 Fundamental scale types All scale functions in ggplot2 belong to one of three fundamental types: continuous scales, discrete scales, and binned scales. 
Each fundamental type is handled by one of three scale constructor functions: continuous_scale(), discrete_scale() and binned_scale(). Although you should never need to call these constructor functions, they provide the organizing structure for scales and it is useful to know about them. "],["scale-breaks.html", "13.2 Scale Breaks", " 13.2 Scale Breaks Discussion of what unifies the concept of breaks across continuous, discrete and binned scales: they are specific data values at which the guide needs to display something. Include additional detail about break functions. "],["scale-limits.html", "13.3 Scale Limits", " 13.3 Scale Limits Section 15.1 introduced the concept that a scale defines a mapping from the data space to the aesthetic space. Scale limits are an extension of this idea: they dictate the region of the data space over which the mapping is defined. For continuous and binned scales, the data space is inherently continuous and one-dimensional, so the limits can be specified by two end points. For discrete scales, however, the data space is unstructured and consists only of a set of categories: as such the limits for a discrete scale can only be specified by enumerating the set of categories over which the mapping is defined. The toolbox chapters outline the common practical goals for specifying the limits: for position scales the limits are used to set the end points of the axis, for example. This leads naturally to the question of what ggplot2 should do if the data set contains “out of bounds” values that fall outside the limits. The default behaviour in ggplot2 is to convert out of bounds values to NA. We can override this default by setting oob argument of the scale, a function that is applied to all observations outside the scale limits. The default is scales::oob_censor() which replaces any value outside the limits with NA. Another option is scales::oob_squish() which squishes all values into the range. 
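To make the two out-of-bounds behaviours concrete, here is a small sketch that calls the {scales} functions directly on a numeric vector. The vector `x` and the limits are made-up values for illustration.

```r
library(scales)

x <- c(-1, 0.5, 2)   # two values fall outside the limits [0, 1]
limits <- c(0, 1)

# Censoring: out-of-bounds values are replaced with NA
censored <- oob_censor(x, range = limits)

# Squishing: out-of-bounds values are moved to the nearest limit
squished <- oob_squish(x, range = limits)
```

In a plot, either function can be supplied to the oob argument of a scale, for example oob = scales::oob_squish.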
An example using a fill scale is shown below: In the first plot the default fill colours are shown, ranging from dark blue to light blue. In the second plot the scale limits for the fill aesthetic are reduced so that the values for the three rightmost bars are replaced with NA and are mapped to a grey shade. In some cases this is desired behaviour but often it is not: the third plot addresses this by modifying the oob function appropriately. "],["scale-guides.html", "13.4 Scale guides", " 13.4 Scale guides Scale guides are more complex than scale names: where the name argument (and labs()) takes text as input, the guide argument (and guides()) requires a guide object created by a guide function such as guide_colourbar() or guide_legend(). The arguments to these functions offer additional fine control over the guide. The table below summarises the default guide functions associated with different scale types (scale type: default guide type): continuous scales for colour/fill aesthetics: colourbar; binned scales for colour/fill aesthetics: coloursteps; position scales (continuous, binned, and discrete): axis; discrete scales (except position scales): legend; binned scales (except position/colour/fill scales): bins. Each of these guide types has appeared earlier in the toolbox: guide_colourbar() is discussed in Section 11.2.5; guide_coloursteps() is discussed in Section 11.4.2; guide_axis() is discussed in Section 10.3.2; guide_legend() is discussed in Section 11.3.6; guide_bins() is discussed in Section 12.1.2. In addition to the functionality discussed in those sections, the guide functions have many arguments that are equivalent to theme settings like text colour, size, font etc, but only apply to a single guide. For information about those settings, see Chapter 18. "],["scale-transformation.html", "13.5 Scale transformation", " 13.5 Scale transformation The most common use for scale transformations is to adjust a continuous position scale, as discussed in Section 10.1.7. 
However, they can sometimes be helpful when applied to other aesthetics. Often this is purely a matter of visual emphasis. An example of this for the Old Faithful density plot is shown below. The linearly mapped scale on the left makes it easy to see the peaks of the distribution, whereas the transformed representation on the right makes it easier to see the regions of non-negligible density around those peaks: Transforming size aesthetics is also possible: In the plot on the left, the z value is naturally interpreted as a “weight”: if each dot corresponds to a group, the z value might be the size of the group. In the plot on the right, the size scale is reversed, and z is more naturally interpreted as a “distance” measure: distant entities are scaled to appear smaller in the plot. "],["legend-merging-and-splitting.html", "13.6 Legend merging and splitting", " 13.6 Legend merging and splitting There is always a one-to-one correspondence between position scales and axes. But the connection between non-position scales and legends is more complex: one legend may need to draw symbols from multiple layers (“merging”), or one aesthetic may need multiple legends (“splitting”). 13.6.1 Merging legends Merging legends occurs quite frequently when using ggplot2. For example, if you’ve mapped colour to both points and lines, the keys will show both points and lines. If you’ve mapped fill colour, you get a rectangle. Note the way the legend varies in the plots below: By default, a layer will only appear in the legend if the corresponding aesthetic is mapped to a variable with aes(). You can override whether or not a layer appears in the legend with show.legend: FALSE prevents a layer from ever appearing in the legend; TRUE forces it to appear when it otherwise wouldn’t. Using TRUE can be useful in conjunction with the following trick to make points stand out: ggplot2 tries to use the fewest number of legends to accurately convey the aesthetics used in the plot. 
It does this by combining legends where the same variable is mapped to different aesthetics. The figure below shows how this works for points: if both colour and shape are mapped to the same variable, then only a single legend is necessary. In order for legends to be merged, they must have the same name. So if you change the name of one of the scales, you’ll need to change it for all of them. One way to do this is by using labs() helper function: 13.6.2 Splitting legends Splitting a legend is a much less common data visualization task. In general it is not advisable to map one aesthetic (e.g. colour) to multiple variables, and so by default ggplot2 does not allow you to “split” the colour aesthetic into multiple scales with separate legends. Nevertheless, there are exceptions to this general rule, and it is possible to override this behaviour using the ggnewscale package. The ggnewscale::new_scale_colour() command acts as an instruction to ggplot2 to initialize a new colour scale: scale and guide commands that appear above the new_scale_colour() command will be applied to the first colour scale, and commands that appear below are applied to the second colour scale. To illustrate this the plot on the left uses geom_point() to display a large marker for each vehicle make in the mpg data, with a single colour scale that maps to the year. On the right, a second geom_point() layer is overlaid on the plot using small markers: this layer is associated with a different colour scale, used to indicate whether the vehicle has a 4-cylinder engine. Additional details, including functions that apply to other scale types, are available on the package website, https://github.com/eliocamp/ggnewscale. "],["legend-key-glyphs.html", "13.7 Legend key glyphs", " 13.7 Legend key glyphs In most cases the default glyphs shown in the legend key will be appropriate to the layer and the aesthetic. 
Should you need to override this behaviour, the key_glyph argument can be used to associate a particular layer with a different kind of glyph. For example: More precisely, each geom is associated with a function such as draw_key_path() or draw_key_boxplot(), which is responsible for drawing the key when the legend is created. You can pass the desired key drawing function directly: for example, base + geom_line(key_glyph = draw_key_timeseries) would also produce the plot shown above. For more information about changing key glyphs, see https://www.emilhvitfeldt.com/post/changing-glyph-in-ggplot2/. "],["meeting-videos-13.html", "13.8 Meeting Videos", " 13.8 Meeting Videos 13.8.1 Cohort 1 Meeting chat log 00:07:09 June Choe: hello! 00:08:55 Federica Gazzelloni: Hello! 00:46:16 Kent Johnson: Examples of key glyphs: https://www.emilhvitfeldt.com/post/changing-glyph-in-ggplot2/ 00:48:30 June Choe: that one is just two overlapping points i think (with different sizes) 00:48:35 June Choe: (yes what kent said) "],["coordinate-systems.html", "Chapter 14 Coordinate systems", " Chapter 14 Coordinate systems Learning objectives: What are coord_<functions>? What are the differences between coord_<functions> in {ggplot2}? How to use coordinate systems in {ggplot2} "],["introduction-5.html", "14.1 Introduction", " 14.1 Introduction The coordinate system in {ggplot2} can be managed with the use of coord_<functions>. 
This is done when we need to: zoom into a plot in a particular area of the plot flip the axis of a plot set a fixed aspect ratio of a plot transform coordinates change the shape of the plot set the coordinates for a map projection library(tidyverse) library(patchwork) iris %>% head() Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3 0.2 setosa 4 4.6 3.1 1.5 0.2 setosa 5 5.0 3.6 1.4 0.2 setosa 6 5.4 3.9 1.7 0.4 setosa "],["linear-coordinate-systems.html", "14.2 Linear coordinate systems", " 14.2 Linear coordinate systems coord_cartesian(): the default Cartesian coordinate system, where the 2d position of an element is given by the combination of the x and y positions. coord_flip(): Cartesian coordinate system with x and y axes flipped. coord_fixed(): Cartesian coordinate system with a fixed aspect ratio. coord_cartesian() p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point(aes(fill=Species), show.legend = F, shape=21,color="grey20",alpha=0.5) + geom_smooth(color="pink") + theme_light() p1 | p1 + scale_x_continuous(limits = c(5, 6)) | p1 + coord_cartesian(xlim = c(5, 6)) coord_flip() p2 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point(aes(fill=Species), show.legend = F, shape=21,color="grey20",alpha=0.5) + geom_smooth(color="pink") + theme_light() p3 <- ggplot(iris, aes(Sepal.Width,Sepal.Length)) + geom_point(aes(fill=Species), show.legend = F, shape=21,color="grey20",alpha=0.5) + geom_smooth(color="pink") + theme_light() p2 | p2 + coord_flip() | p3 (the smooth is fit to the rotated data). coord_fixed() p3 | p3 + coord_fixed() "],["non-linear-coordinate-systems.html", "14.3 Non-linear coordinate systems", " 14.3 Non-linear coordinate systems coord_polar(): Polar coordinates. coord_map()/coord_quickmap()/coord_sf(): Map projections. coord_trans(): Apply arbitrary transformations to x and y positions, after the data has been processed by the stat. 
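The phrase "after the data has been processed by the stat" can be illustrated with a rough sketch that compares a scale transformation with a coordinate transformation. The choice of the mpg data and a linear smooth is an assumption made for illustration only.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Scale transformation: hwy is log-transformed BEFORE the stat runs,
# so the linear model is fitted to log10(hwy)
p_scale <- p + scale_y_log10()

# Coordinate transformation: the linear model is fitted to the raw data,
# and only the drawn result is bent onto a log10 axis afterwards
p_coord <- p + coord_trans(y = "log10")

# The computed smooth is stored in different units in the two plots
range(layer_data(p_scale, 2)$y)  # log10 units
range(layer_data(p_coord, 2)$y)  # original hwy units
```

With the scale version the fitted line stays straight on the log axis, while with the coord version the straight fit on the raw data is drawn as a curve.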
coord_polar() p4 <- iris %>% ggplot(aes(x = Species, y = Petal.Width)) + geom_col(aes(color=Species,fill=Species),show.legend = F)+ theme_light() p4 + coord_polar(theta = "x") | p4 + coord_polar(theta = "y") 14.3.1 Example: Coord_polar() with DuBoisChallenge N°8 data source: DuBois data portraits df <- read_csv("https://raw.githubusercontent.com/ajstarks/dubois-data-portraits/master/challenge/2022/challenge08/data.csv") df2 <- df %>% arrange(-Year) df2[7,1] <- 1875 df2[7,2] <- 0 df2[7,3] <- 0 df2 %>% ggplot() + geom_line(data= subset(df2, Year %in% c(1875,1875)), mapping = aes(x=Year, y= `Houshold Value (Dollars)`), color="#FFCDCB",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880)), mapping= aes(x=Year +2, y= `Houshold Value (Dollars)`), color="#989EB4",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885)), mapping= aes(x=Year +4, y= `Houshold Value (Dollars)`), color="#b08c71",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890)), mapping= aes(x=Year +6, y= `Houshold Value (Dollars)`), color="#FFC942",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890,1895)), mapping= aes(x=Year +8, y= `Houshold Value (Dollars)`), color="#EFDECC", size=6) + geom_line(mapping= aes(x=Year +10, y= `Houshold Value (Dollars)`), color="#F02C49",size=6) + coord_polar(theta = "y", start = 0, direction = 1, clip = "off") + # other scales that can be used: #scale_x_reverse(expand=expansion(mult=c(-0.9,-0.1),add=c(29,-0.1))) + #scale_y_continuous(expand=expansion(mult=c(0.09,0.01),add=c(0,-790000))) + scale_x_reverse(expand=expansion(add=c(11,-5))) + scale_y_continuous(expand=expansion(add=c(0,-600000))) + labs(title="ASSESSED VALUE OF HOUSEHOLD AND KITCHEN FURNITURE OWNED BY GEORGIA NEGROES.")+ theme_void() + theme(text = element_text(face="bold", color="grey27"), aspect.ratio =2/1.9, #y/x plot.background = element_rect(color= "#d9ccbf", fill= "#d9ccbf"), plot.title = element_text(hjust=0.5,size=9)) coord_trans() rect 
<- data.frame(x = 50, y = 50) line <- data.frame(x = c(1, 200), y = c(100, 1)) p6 <- ggplot(mapping = aes(x, y)) + geom_tile(data = rect, aes(width = 50, height = 50)) + geom_line(data = line) + xlab(NULL) + ylab(NULL) p6 p6 + coord_trans(y = "log10") p7 <- ggplot(iris, aes(Sepal.Length, Petal.Length)) + stat_bin2d() + geom_smooth(method = "lm") + xlab(NULL) + ylab(NULL) + theme(legend.position = "none") p7 #> `geom_smooth()` using formula 'y ~ x' # Fit on log scale, but harder to interpret p7 + scale_x_log10() + scale_y_log10() #> `geom_smooth()` using formula 'y ~ x' # Fit on log scale, then backtransform to the original scale. pow10 <- scales::exp_trans(10) p7 + scale_x_log10() + scale_y_log10() + coord_trans(x = pow10, y = pow10) coord_map()/coord_quickmap()/coord_sf() world <- map_data("world") worldmap <- ggplot(world, aes(long, lat, group = group)) + geom_path() + scale_y_continuous(NULL, breaks = (-2:3) * 30, labels = NULL) + scale_x_continuous(NULL, breaks = (-4:4) * 45, labels = NULL) worldmap + coord_quickmap() | worldmap + coord_map("ortho") | worldmap + coord_map("stereographic") "],["meeting-videos-14.html", "14.4 Meeting Videos", " 14.4 Meeting Videos 14.4.1 Cohort 1 Meeting chat log 00:08:33 June Choe: hi all :) 00:08:50 Federica Gazzelloni: Hi 00:09:48 June Choe: yeah I think folks can catch up on youtube maybe 00:28:00 June Choe: thats very neat - didn't know you could "squish" the polar-transformed shapes with scale expansion 00:38:32 June Choe: An interesting discussion for coord_polar on twitter - https://twitter.com/mattansb/status/1506620436771229715?s=20&t=I4IebpuwA_ZxDwzA4BqqwQ 00:38:45 June Choe: I was in an exchange with @mattansb on how to "crop" polar coordinate plots 00:39:15 June Choe: this was his solution, and I find it quite nice - https://mattansb.github.io/MSBMisc/reference/crop_coord_polar.html 00:40:30 June Choe: this was great - thank you! 00:41:03 June Choe: sounds good! 
"],["faceting-2.html", "Chapter 15 Faceting", " Chapter 15 Faceting Learning objectives: Facet wrap Facet grid Controlling scales Missing faceting variables Grouping vs. faceting Continuous variables "],["facets.html", "15.1 Facets", " 15.1 Facets library(tidyverse) mpg2 <- subset(mpg, cyl != 5 & drv %in% c("4", "f") & class != "2seater") base <- ggplot(mpg2, aes(displ, hwy)) + geom_blank() + xlab(NULL) + ylab(NULL) mpg2%>%count(class) # A tibble: 6 × 2 class n <chr> <int> 1 compact 45 2 midsize 41 3 minivan 11 4 pickup 33 5 subcompact 24 6 suv 51 base + facet_wrap(~class, ncol = 3) base + facet_wrap(~class, ncol = 3, as.table = FALSE) base + facet_wrap(~class, nrow = 3) base + facet_wrap(~class, nrow = 3, dir = "v") base + facet_grid(. ~ cyl) base + facet_grid(drv ~ .) base + facet_grid(drv ~ cyl) p <- ggplot(mpg, aes(cty, hwy)) + geom_abline() + geom_jitter(width = 0.1, height = 0.1) p + facet_grid(drv ~ cyl) facet_wrap(~cyl) <ggproto object: Class FacetWrap, Facet, gg> compute_layout: function draw_back: function draw_front: function draw_labels: function draw_panels: function finish_data: function init_scales: function map_data: function params: list setup_data: function setup_params: function shrink: TRUE train_scales: function vars: function super: <ggproto object: Class FacetWrap, Facet, gg> p+ facet_wrap(~cyl, scales = "free_y") economics_long%>%count(date) # A tibble: 574 × 2 date n <date> <int> 1 1967-07-01 5 2 1967-08-01 5 3 1967-09-01 5 4 1967-10-01 5 5 1967-11-01 5 6 1967-12-01 5 7 1968-01-01 5 8 1968-02-01 5 9 1968-03-01 5 10 1968-04-01 5 # ℹ 564 more rows ggplot(economics_long, aes(date, value)) + geom_line() + facet_wrap(~variable, scales = "free_y", ncol = 1) mpg2$model <- reorder(mpg2$model, mpg2$cty) mpg2$manufacturer <- reorder(mpg2$manufacturer, -mpg2$cty) ggplot(mpg2, aes(cty, model)) + geom_point() + facet_grid(manufacturer ~ ., scales = "free", space = "free") + theme(strip.text.y = element_text(angle = 0)) df1 <- data.frame(x = 1:3, y = 
1:3, gender = c("f", "f", "m")) df2 <- data.frame(x = 2, y = 2) ggplot(df1, aes(x, y)) + geom_point(data = df2, colour = "red", size = 2) + geom_point() + facet_wrap(~gender) df <- data.frame( x = rnorm(120, c(0, 2, 4)), y = rnorm(120, c(1, 2, 1)), z = letters[1:3] ) ggplot(df, aes(x, y)) + geom_point(aes(colour = z)) ggplot(df, aes(x, y)) + geom_point(aes(color=z)) + facet_wrap(~z) df_sum <- df %>% group_by(z) %>% summarise(x = mean(x), y = mean(y)) %>% rename(z2 = z) ggplot(df, aes(x, y)) + geom_point() + geom_point(data = df_sum, aes(colour = z2), size = 4) + facet_wrap(~z) df2 <- dplyr::select(df, -z) ggplot(df, aes(x, y)) + geom_point(data = df2, colour = "grey70") + geom_point(aes(colour = z)) + facet_wrap(~z) age<-seq(18,60,1) id <- seq_along(age) # one id per age value, so cbind() does not recycle my_df <- as.data.frame(cbind(id,age)) my_df %>% mutate(age_cat=cut_interval(age,length=5))%>%head() id age age_cat 1 1 18 [15,20] 2 2 19 [15,20] 3 3 20 [15,20] 4 4 21 (20,25] 5 5 22 (20,25] 6 6 23 (20,25] # Bins of width 1 mpg2$disp_w <- cut_width(mpg2$displ, 1) # Six bins of equal length mpg2$disp_i <- cut_interval(mpg2$displ, 6) # Six bins containing equal numbers of points mpg2$disp_n <- cut_number(mpg2$displ, 6) plot <- ggplot(mpg2, aes(cty, hwy)) + geom_point() + labs(x = NULL, y = NULL) plot + facet_wrap(~disp_w, nrow = 1) "],["meeting-videos-15.html", "15.2 Meeting Videos", " 15.2 Meeting Videos 15.2.1 Cohort 1 "],["themes.html", "Chapter 16 Themes", " Chapter 16 Themes Learning objectives: How can I customize the output of my plot? What are the functions theme_<function>() and theme()? "],["theme.html", "16.1 Theme", " 16.1 Theme Plots can be customized by adding these functions to your plot: scale_fill/color_ theme_ theme() … 16.1.1 Complete themes In ggplot2 there are preset themes ready to use: library(tidyverse) df <- data.frame(x = 1:3, y = 1:3) base <- ggplot(df, aes(x, y)) + geom_point() p1<-base + theme_grey() + ggtitle("theme_grey()") p2<-base + theme_bw() + ggtitle("theme_bw()") p3<-base + 
theme_linedraw() + ggtitle("theme_linedraw()") library(patchwork) p1+p2+p3 p4<-base + theme_light() + ggtitle("theme_light()") p5<- base + theme_dark() + ggtitle("theme_dark()") p6<-base + theme_minimal() + ggtitle("theme_minimal()") p4+p5+p6 p7<-base + theme_classic() + ggtitle("theme_classic()") p8<-base + theme_void() + ggtitle("theme_void()") p7+p8 Or you can use themes from other packages such as {ggthemes}, or others listed here: ggplot extension gallery library(ggthemes) p9<-base + theme_tufte() + ggtitle("theme_tufte()") p10<-base + theme_solarized() + ggtitle("theme_solarized()") p11<-base + theme_excel() + ggtitle("theme_excel()") p9+p10+p11 Modifying complete theme components with the theme() function "],["plot-elements-of-a-theme.html", "16.2 Plot elements of a theme", " 16.2 Plot elements of a theme Axis elements Legend elements Panel elements Faceting elements See ?theme in the RStudio help pane for more info. "],["meeting-videos-16.html", "16.3 Meeting Videos", " 16.3 Meeting Videos 16.3.1 Cohort 1 "],["programming-with-ggplot2.html", "Chapter 17 Programming with ggplot2", " Chapter 17 Programming with ggplot2 Learning objectives: Programming single and multiple components Use components, annotation, and additional arguments in a plot Functional programming What are the components of a plot? data.frame aes() Scales Coordinate systems Theme components "],["programming-single-and-multiple-components.html", "17.1 Programming single and multiple components", " 17.1 Programming single and multiple components In ggplot2 it is possible to build up plot components easily. This is a good practice to reduce duplicated code. Generalising code gives you more flexibility when making customised plots. 
17.1.1 Components One example of a plot component is the one below: bestfit <- geom_smooth( method = "lm", se = FALSE, colour = alpha("steelblue", 0.5), size = 2) This single component can be placed inside the syntax of the grammar of graphics and used as a plot layer. ggplot(mpg, aes(cty, hwy)) + geom_point() + bestfit Another way is to build the layer through a function: geom_lm <- function(formula = y ~ x, colour = alpha("steelblue", 0.5), size = 2, ...) { geom_smooth(formula = formula, se = FALSE, method = "lm", colour = colour, size = size, ...) } And then apply the function layer to the plot: ggplot(mpg, aes(displ, 1 / hwy)) + geom_point() + geom_lm(y ~ poly(x, 2), size = 1, colour = "red") The book draws attention to the “open” parameter …. A suggestion is to use it inside the function body rather than in the definition of the function's parameters. Instead of only one component, we can build a plot made of several components. geom_mean <- function() { list( stat_summary(fun = "mean", geom = "bar", fill = "grey70"), stat_summary(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4) ) } With this result: ggplot(mpg, aes(class, cty)) + geom_mean() "],["use-components-annotation-and-additional-arguments-in-a-plot.html", "17.2 Use components, annotation, and additional arguments in a plot", " 17.2 Use components, annotation, and additional arguments in a plot We have just seen some examples of how to make new components. What if we want to know more about existing components? One example is the borders() function, provided by {ggplot2} to create a layer of map borders. “A quick and dirty way to get map data (from the maps package) on to your plot.” borders <- function(database = "world", regions = ".", fill = NA, colour = "grey50", ...) 
{ df <- map_data(database, regions) geom_polygon( aes_(~long, ~lat, group = ~group), data = df, fill = fill, colour = colour, ..., inherit.aes = FALSE, show.legend = FALSE ) } library(maps) data(us.cities) capitals <- subset(us.cities, capital == 2) ggplot(capitals, aes(long, lat)) + borders("world", xlim = c(-130, -60), ylim = c(20, 50)) + geom_point(aes(size = pop)) + scale_size_area() + coord_quickmap() We can even add additional arguments, using helpers such as these to modify and combine parameter lists: modifyList() do.call() geom_mean <- function(..., bar.params = list(), errorbar.params = list()) { params <- list(...) bar.params <- modifyList(params, bar.params) errorbar.params <- modifyList(params, errorbar.params) bar <- do.call("stat_summary", modifyList( list(fun = "mean", geom = "bar", fill = "grey70"), bar.params) ) errorbar <- do.call("stat_summary", modifyList( list(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4), errorbar.params) ) list(bar, errorbar) } And here is the result: ggplot(mpg, aes(class, cty)) + geom_mean( colour = "steelblue", errorbar.params = list(width = 0.5, size = 1) ) "],["functional-programming.html", "17.3 Functional programming", " 17.3 Functional programming An example is to make a geom. For this we can have a look at the “Corporate Reputation” data from #TidyTuesday 2022 week 22. 
poll <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/poll.csv') reputation <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/reputation.csv') rep2<-reputation%>% group_by(company,industry)%>% summarize(score,rank)%>% ungroup()%>% mutate(year=2022) full <- poll%>% filter(!is.na(year))%>% full_join(rep2,by=c("2022_rank"="rank","2022_rq"="score","company","industry","year")) %>% count(year,company,industry,"rank"=`2022_rank`,"score"=`2022_rq`,sort=T) %>% arrange(-year) ################## # mapping = aes(x = fct_reorder(x,-y), y = y, fill = y, color = y, label = y) rank_plot <- function(data,mapping) { data %>% ggplot(mapping)+ # aes(x=fct_reorder(x,-y),y=y) geom_col(width =0.3, # aes(fill=rank) show.legend = F)+ geom_text(hjust=0,fontface="bold", # aes(label=rank,color=rank), show.legend = F)+ scale_y_discrete(expand = c(0, 0, .5, 0))+ coord_flip()+ ggthemes::scale_fill_continuous_tableau(palette = "Green-Gold")+ ggthemes::scale_color_continuous_tableau(palette = "Green-Gold")+ labs(title="", x="",y="")+ theme(axis.text.x = element_blank(), axis.text.y = element_text(face="bold"), axis.ticks.x = element_blank(), axis.ticks.y = element_line(size=2), panel.grid.major.x = element_blank(), panel.grid.minor.x = element_blank(), panel.grid.major.y = element_line(size=2), plot.background = element_rect(color="grey95",fill="grey95"), panel.background = element_rect(color="grey92",fill="grey92")) } df<-full%>% filter(year==2017, industry=="Retail") rank_plot(data = df, mapping = aes(x=fct_reorder(company,-rank),y=rank, fill = rank, color = rank, label = rank)) "],["references.html", "17.4 References", " 17.4 References extending ggplot2 functions expressions functional programming advanced R - functionals "],["meeting-videos-17.html", "17.5 Meeting Videos", " 17.5 Meeting Videos 17.5.1 Cohort 1 Meeting chat log 00:41:31 Priyanka Gagneja: There’s a lot of 
disturbance :( 01:00:48 Priyanka Gagneja: https://plotly.com/ggplot2/setting-graph-size/ "],["internals-of-ggplot2.html", "Chapter 18 Internals of ggplot2", " Chapter 18 Internals of ggplot2 Learning Objectives What is the difference between user-facing code and internal code? What is the distinction between ggplot_build() and ggplot_gtable()? What is the division of labor between {ggplot2} and {grid}? What is the basic structure of/motivation for ggproto? library(ggplot2) library(ggtrace) # remotes::install_github("yjunechoe/ggtrace") library(purrr) library(dplyr) 18.0.0.1 Introduction (the existence of internals) The user-facing code that defines a ggplot on the surface is not the same as the internal code that creates a ggplot under the hood. In this chapter, we’ll learn how the internal code operates and develop some intuitions for thinking about the internals, starting with these two simple examples of mismatches between surface and underlying form: 18.0.0.2 Case 1: Order You can change the order of some “layers” without changing the graphical output. 
For example, scale_*() can be added anywhere and always ends up applying to the whole plot: ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + scale_x_log10() + #< scale first geom_point() + geom_smooth() ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + geom_point() + scale_x_log10() + #< scale middle geom_smooth() However, the order of geom_*() and stat_*() matters for the order of drawing: ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + geom_point() + geom_smooth(fill = "black") ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + geom_smooth(fill = "black") + geom_point() 18.0.0.3 Case 2: Modularity We know that the user-facing “layer” code that we add to a ggplot with + is made of stand-alone functions: lm_smooth <- geom_smooth(method = "lm", formula = y ~ x) lm_smooth geom_smooth: na.rm = FALSE, orientation = NA, se = TRUE stat_smooth: na.rm = FALSE, orientation = NA, se = TRUE, method = lm, formula = y ~ x position_identity When we add this object to different ggplots, it materializes in different ways: ggplot(mtcars, aes(mpg, hp)) + lm_smooth ggplot(mtcars, aes(wt, disp)) + lm_smooth "],["the-plot-method.html", "18.1 The plot() method", " 18.1 The plot() method User-facing code and internal code are also separated by when they are evaluated. User-facing code like geom_smooth() is evaluated immediately to give you a ggplot object, but the internal code is only evaluated when a ggplot object is printed or plotted, via print() and plot(). The following code simply creates a ggplot object from user-facing code, and DOES NOT print or plot the ggplot (yet). p <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point(position = position_jitter(seed = 2022)) + geom_smooth(method = "lm", formula = y ~ x) + facet_wrap(vars(year)) + ggtitle("A plot for expository purposes") The ggplot object is actually just a list under the hood: class(p) [1] "gg" "ggplot" typeof(p) [1] "list" Evaluating the ggplot is what gives you the actual points, rectangles, text, etc. 
that make up the figure (and you can also do so explicitly with print()/plot()) p # print(p) # plot(p) These are two separate processes, but we often think of them as one monolithic process: defining_benchmark <- bench::mark( # Evaluates user-facing code to define ggplot, # but does not call plot/print method p <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point(position = position_jitter(seed = 2022)) + geom_smooth(method = "lm", formula = y ~ x) + facet_wrap(vars(year)) + ggtitle("A plot for expository purposes") ) plotting_benchmark <- bench::mark( # Plots the ggplot plot(p) ) bind_rows( defining_benchmark[,2:5], plotting_benchmark[,2:5] ) # A tibble: 2 × 4 min median `itr/sec` mem_alloc <bch:tm> <bch:tm> <dbl> <bch:byt> 1 3.06ms 3.34ms 295. 20.36KB 2 231.52ms 233.04ms 4.29 3.51MB The plot that gets rendered from a ggplot object is actually a side effect of evaluating the ggplot object: # Same as ggplot2:::print.ggplot ggplot2:::plot.ggplot function (x, newpage = is.null(vp), vp = NULL, ...) 
{ set_last_plot(x) if (newpage) grid.newpage() grDevices::recordGraphics(requireNamespace("ggplot2", quietly = TRUE), list(), getNamespace("ggplot2")) data <- ggplot_build(x) gtable <- ggplot_gtable(data) if (is.null(vp)) { grid.draw(gtable) } else { if (is.character(vp)) seekViewport(vp) else pushViewport(vp) grid.draw(gtable) upViewport() } if (isTRUE(getOption("BrailleR.VI")) && rlang::is_installed("BrailleR")) { print(asNamespace("BrailleR")$VI(x)) } invisible(x) } <bytecode: 0x5567abddec40> <environment: namespace:ggplot2> The above code can be simplified to this: ggprint <- function(x) { data <- ggplot_build(x) gtable <- ggplot_gtable(data) grid::grid.newpage() grid::grid.draw(gtable) return(invisible(x)) #< hence "side effect" } ggprint(p) Roughly put, you start out with the ggplot object, which gets passed to ggplot_build(), the result of which in turn gets passed to ggplot_gtable() and is finally drawn with {grid}: library(grid) grid.newpage() # Clear display p %>% ggplot_build() %>% # 1. data for each layer is prepared for drawing ggplot_gtable() %>% # 2. drawing-ready data is turned into graphical elements grid.draw() # 3. graphical elements are converted to an image At each step, you get closer to the low-level information you need to draw the actual plot: obj_byte <- function(x) { scales::label_bytes()(as.numeric(object.size(x))) } # ggplot object p %>% obj_byte() [1] "32 kB" # data used to make graphical elements ggplot_build(p) %>% obj_byte() [1] "102 kB" # graphical elements for the plot ggplot_gtable(ggplot_build(p)) %>% obj_byte() [1] "684 kB" # the rendered plot ggsave( filename = tempfile(fileext = ".png"), plot = ggplot_gtable(ggplot_build(p)), # File size depends on format, dimension, resolution, etc. ) %>% file.size() %>% {scales::label_bytes()(.)} [1] "243 kB" The rest of the chapter focuses on what happens in this pipeline: the ggplot_build() step and the ggplot_gtable() step. 
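The three stages of the pipeline can also be run one at a time on a smaller plot, keeping each intermediate object around for inspection. This is only a minimal sketch: p_small, built, gt and layer1 are hypothetical names introduced for illustration, and the drawing call is left commented out so that nothing is rendered as a side effect.

```r
library(ggplot2)

# Hypothetical small plot, used only for illustration
p_small <- ggplot(mtcars, aes(mpg, hp)) + geom_point()

# 1. Build step: user-provided data becomes drawing-ready data
built <- ggplot_build(p_small)

# 2. gtable step: drawing-ready data becomes a table of graphical objects
gt <- ggplot_gtable(built)

# 3. Drawing step (a side effect), commented out here:
# grid::grid.newpage(); grid::grid.draw(gt)

# layer_data() is the exported shortcut for the built data of one layer
layer1 <- layer_data(p_small, 1)
names(layer1)
```

Keeping the intermediate objects separate makes it easier to see which stage is responsible for a given piece of information, such as the PANEL and group columns that only exist after the build step.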
"],["the-build-step.html", "18.2 The build step", " 18.2 The build step This is the function body of ggplot_build(): ggplot2:::ggplot_build.ggplot function (plot) { plot <- plot_clone(plot) if (length(plot$layers) == 0) { plot <- plot + geom_blank() } layers <- plot$layers data <- rep(list(NULL), length(layers)) scales <- plot$scales data <- by_layer(function(l, d) l$layer_data(plot$data), layers, data, "computing layer data") data <- by_layer(function(l, d) l$setup_layer(d, plot), layers, data, "setting up layer") layout <- create_layout(plot$facet, plot$coordinates, plot$layout) data <- layout$setup(data, plot$data, plot$plot_env) data <- by_layer(function(l, d) l$compute_aesthetics(d, plot), layers, data, "computing aesthetics") data <- .ignore_data(data) data <- lapply(data, scales$transform_df) scale_x <- function() scales$get_scales("x") scale_y <- function() scales$get_scales("y") layout$train_position(data, scale_x(), scale_y()) data <- layout$map_position(data) data <- .expose_data(data) data <- by_layer(function(l, d) l$compute_statistic(d, layout), layers, data, "computing stat") data <- by_layer(function(l, d) l$map_statistic(d, plot), layers, data, "mapping stat to aesthetics") plot$scales$add_missing(c("x", "y"), plot$plot_env) data <- by_layer(function(l, d) l$compute_geom_1(d), layers, data, "setting up geom") data <- by_layer(function(l, d) l$compute_position(d, layout), layers, data, "computing position") data <- .ignore_data(data) layout$reset_scales() layout$train_position(data, scale_x(), scale_y()) layout$setup_panel_params() data <- layout$map_position(data) layout$setup_panel_guides(plot$guides, plot$layers) npscales <- scales$non_position_scales() if (npscales$n() > 0) { lapply(data, npscales$train_df) plot$guides <- plot$guides$build(npscales, plot$layers, plot$labels, data) data <- lapply(data, npscales$map_df) } else { plot$guides <- plot$guides$get_custom() } data <- .expose_data(data) data <- by_layer(function(l, d) 
l$compute_geom_2(d), layers, data, "setting up geom aesthetics") data <- by_layer(function(l, d) l$finish_statistics(d), layers, data, "finishing layer stat") data <- layout$finish_data(data) plot$labels$alt <- get_alt_text(plot) structure(list(data = data, layout = layout, plot = plot), class = "ggplot_built") } <bytecode: 0x55679c06a548> <environment: namespace:ggplot2> It takes the ggplot object as input, and transforms the user-provided data to a drawing-ready data (+ some other auxiliary/meta-data like information about the layout). You can see that the drawing-ready data data is built up incrementally (much like data wrangling minus pipes): as.list(body(ggplot2:::ggplot_build.ggplot)) [[1]] `{` [[2]] plot <- plot_clone(plot) [[3]] if (length(plot$layers) == 0) { plot <- plot + geom_blank() } [[4]] layers <- plot$layers [[5]] data <- rep(list(NULL), length(layers)) [[6]] scales <- plot$scales [[7]] data <- by_layer(function(l, d) l$layer_data(plot$data), layers, data, "computing layer data") [[8]] data <- by_layer(function(l, d) l$setup_layer(d, plot), layers, data, "setting up layer") [[9]] layout <- create_layout(plot$facet, plot$coordinates, plot$layout) [[10]] data <- layout$setup(data, plot$data, plot$plot_env) [[11]] data <- by_layer(function(l, d) l$compute_aesthetics(d, plot), layers, data, "computing aesthetics") [[12]] data <- .ignore_data(data) [[13]] data <- lapply(data, scales$transform_df) [[14]] scale_x <- function() scales$get_scales("x") [[15]] scale_y <- function() scales$get_scales("y") [[16]] layout$train_position(data, scale_x(), scale_y()) [[17]] data <- layout$map_position(data) [[18]] data <- .expose_data(data) [[19]] data <- by_layer(function(l, d) l$compute_statistic(d, layout), layers, data, "computing stat") [[20]] data <- by_layer(function(l, d) l$map_statistic(d, plot), layers, data, "mapping stat to aesthetics") [[21]] plot$scales$add_missing(c("x", "y"), plot$plot_env) [[22]] data <- by_layer(function(l, d) l$compute_geom_1(d), 
layers, data, "setting up geom") [[23]] data <- by_layer(function(l, d) l$compute_position(d, layout), layers, data, "computing position") [[24]] data <- .ignore_data(data) [[25]] layout$reset_scales() [[26]] layout$train_position(data, scale_x(), scale_y()) [[27]] layout$setup_panel_params() [[28]] data <- layout$map_position(data) [[29]] layout$setup_panel_guides(plot$guides, plot$layers) [[30]] npscales <- scales$non_position_scales() [[31]] if (npscales$n() > 0) { lapply(data, npscales$train_df) plot$guides <- plot$guides$build(npscales, plot$layers, plot$labels, data) data <- lapply(data, npscales$map_df) } else { plot$guides <- plot$guides$get_custom() } [[32]] data <- .expose_data(data) [[33]] data <- by_layer(function(l, d) l$compute_geom_2(d), layers, data, "setting up geom aesthetics") [[34]] data <- by_layer(function(l, d) l$finish_statistics(d), layers, data, "finishing layer stat") [[35]] data <- layout$finish_data(data) [[36]] plot$labels$alt <- get_alt_text(plot) [[37]] structure(list(data = data, layout = layout, plot = plot), class = "ggplot_built") 18.2.1 Data preparation The data from the ggplot is prepared in a special format for each layer (essentially, just a dataframe with a predictable set of column names). A layer (specifically, the output of ggplot2::layer()) can provide data in one of three ways: Inherited from the data supplied to ggplot() Supplied directly from the layer’s data argument A function that returns a data when applied to the global data data_demo_p <- ggplot(mtcars, aes(disp, cyl)) + # 1) Inherited data geom_point(color = "blue") + # 2) Data supplied directly geom_point( color = "red", alpha = .2, data = mpg %>% mutate(disp = displ * 100) ) + # 3) Function to be applied to inherited data geom_label( aes(label = paste("cyl:", cyl)), data = . 
%>% group_by(cyl) %>% summarize(disp = mean(disp)) ) data_demo_p Inside the layers element of the ggplot are Layer objects which hold information about each layer: data_demo_p$layers map( data_demo_p$layers, class ) And the calculated data from each layer can be accessed with layer_data() method of the Layer object: ggplot2:::Layer$layer_data <ggproto method> <Wrapper function> function (...) layer_data(..., self = self) <Inner function (f)> function (self, plot_data) { if (is.waive(self$data)) { data <- plot_data } else if (is.function(self$data)) { data <- self$data(plot_data) if (!is.data.frame(data)) { cli::cli_abort("{.fn layer_data} must return a {.cls data.frame}.") } } else { data <- self$data } if (is.null(data) || is.waive(data)) data else unrowname(data) } data_demo_p$layers[[1]]$layer_data(data_demo_p$data) mpg cyl disp hp drat wt qsec vs am gear carb 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 26 27.3 4 79.0 66 
4.08 1.935 18.90 1 1 4 1 27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 32 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 data_demo_p$layers[[2]]$layer_data(data_demo_p$data) # A tibble: 234 × 12 manufacturer model displ year cyl trans drv cty hwy fl class <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> 1 audi a4 1.8 1999 4 auto… f 18 29 p comp… 2 audi a4 1.8 1999 4 manu… f 21 29 p comp… 3 audi a4 2 2008 4 manu… f 20 31 p comp… 4 audi a4 2 2008 4 auto… f 21 30 p comp… 5 audi a4 2.8 1999 6 auto… f 16 26 p comp… 6 audi a4 2.8 1999 6 manu… f 18 26 p comp… 7 audi a4 3.1 2008 6 auto… f 18 27 p comp… 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp… 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp… 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp… # ℹ 224 more rows # ℹ 1 more variable: disp <dbl> data_demo_p$layers[[3]]$layer_data(data_demo_p$data) # A tibble: 3 × 2 cyl disp <dbl> <dbl> 1 4 105. 2 6 183. 3 8 353. 
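The three ways a layer can receive its data can be checked on a toy example by counting the rows each layer ends up with. This is a minimal sketch under stated assumptions: toy, other, p_sources and n_rows are hypothetical names, and the exported layer_data() helper is used here, which returns the data after the whole build step rather than the raw output of the Layer's internal layer_data() method.

```r
library(ggplot2)

# Hypothetical toy data, used only for illustration
toy   <- data.frame(x = 1:4, y = c(2, 4, 6, 8), g = c("a", "a", "b", "b"))
other <- data.frame(x = 10, y = 20)

p_sources <- ggplot(toy, aes(x, y)) +
  geom_point() +                                   # 1) inherited from ggplot()
  geom_point(data = other) +                       # 2) supplied directly
  geom_point(data = function(d) d[d$g == "a", ])   # 3) function of the plot data

# Number of rows each layer carries after the build step
n_rows <- vapply(1:3, function(i) nrow(layer_data(p_sources, i)), integer(1))
n_rows
```

Because geom_point() uses the identity stat, the row counts reflect the data source of each layer directly: all rows of toy, the single row of other, and the subset returned by the function.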
This is where the data transformation journey begins inside the plot method: body(ggplot2:::ggplot_build.ggplot)[[5]] data <- rep(list(NULL), length(layers)) body(ggplot2:::ggplot_build.ggplot)[[8]] data <- by_layer(function(l, d) l$setup_layer(d, plot), layers, data, "setting up layer") For data_demo_p, the data variable after step 8 looks like this: ggtrace_inspect_vars( x = data_demo_p, method = ggplot2:::ggplot_build.ggplot, at = 9, vars = "data" ) [[1]] mpg cyl disp hp drat wt qsec vs am gear carb 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 26 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 32 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 [[2]] # A tibble: 234 × 12 manufacturer model displ year cyl trans drv cty 
hwy fl class <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> 1 audi a4 1.8 1999 4 auto… f 18 29 p comp… 2 audi a4 1.8 1999 4 manu… f 21 29 p comp… 3 audi a4 2 2008 4 manu… f 20 31 p comp… 4 audi a4 2 2008 4 auto… f 21 30 p comp… 5 audi a4 2.8 1999 6 auto… f 16 26 p comp… 6 audi a4 2.8 1999 6 manu… f 18 26 p comp… 7 audi a4 3.1 2008 6 auto… f 18 27 p comp… 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp… 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp… 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp… # ℹ 224 more rows # ℹ 1 more variable: disp <dbl> [[3]] # A tibble: 3 × 2 cyl disp <dbl> <dbl> 1 4 105. 2 6 183. 3 8 353. For the expository plot p from the book, the data variable after step 8 looks like the original mpg data: ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 9, vars = "data" ) %>% map(head) 18.2.2 Data transformation 18.2.2.1 PANEL variable and aesthetic mappings Continuing with the book example, the data is augmented with the PANEL variable at Step 10 of the function body listed above (layout$setup()): body(ggplot2:::ggplot_build.ggplot)[[10]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 11, vars = "data" ) %>% map(head) And then the group variable appears at Step 11 (compute_aesthetics()), which is also when aesthetics get “mapped” (= just mutate(), essentially): body(ggplot2:::ggplot_build.ggplot)[[11]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 12, vars = "data" ) %>% map(head) 18.2.2.2 Scales Then, scales are applied in Step 13. 
This leaves the data unchanged for the original plot: body(ggplot2:::ggplot_build.ggplot)[[13]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 14, vars = "data" ) %>% map(head) But the effect can be seen with something like scale_x_log10(): ggtrace_inspect_vars( x = p + scale_x_log10(), method = ggplot2:::ggplot_build.ggplot, at = 14, vars = "data" ) %>% map(head, 3) Out-of-bounds handling happens down the line, at Step 17: body(ggplot2:::ggplot_build.ggplot)[[17]] ggtrace_inspect_vars( x = p + xlim(2, 8), # or scale_x_continuous(oob = scales::oob_censor) method = ggplot2:::ggplot_build.ggplot, at = 18, vars = "data" ) %>% map(head, 3) 18.2.2.3 Stat Stat transformation happens shortly after, at Step 19 of the function body listed above (this is why understanding out-of-bounds handling and scale transformation is important!) body(ggplot2:::ggplot_build.ggplot)[[19]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 20, vars = "data" ) %>% map(head, 3) Note how, at this point, the data for the two layers look different. This is because geom_point() and geom_smooth() have different Stats. class( geom_point()$stat ) [1] "StatIdentity" "Stat" "ggproto" "gg" class( geom_smooth()$stat ) [1] "StatSmooth" "Stat" "ggproto" "gg" 18.2.2.4 Position At Step 23, positions are adjusted (jittering, dodging, stacking, etc.). 
We gave geom_point() a jitter so we see that reflected for the first layer: body(ggplot2:::ggplot_build.ggplot)[[23]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 24, vars = "data" ) %>% map(head, 3) 18.2.2.5 Geom Variables relevant for drawing each layer’s geometry are added in by the Geom, at Step 33 (compute_geom_2()): body(ggplot2:::ggplot_build.ggplot)[[33]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 34, vars = "data" ) %>% map(head, 3) 18.2.3 Output The final state of the data after ggplot_build() is stored in the data element of the output of ggplot_build(): ggplot_build(p)$data %>% map(head, 3) ggplot_build() also returns the trained layout of the plot (scales, panels, etc.) in the layout element, as well as the original ggplot object in the plot element: lapply( ggplot_build(p), class ) 18.2.4 Explore The building of p ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, vars = "data" ) %>% map(map, head, 3) The making of p + scale_x_log10(limits = c(2, 5), oob = scales::oob_censor) ggtrace_inspect_vars( x = p + scale_x_log10(limits = c(2, 5), oob = scales::oob_censor), method = ggplot2:::ggplot_build.ggplot, vars = "data" ) %>% map(map, head, 3) "],["the-gtable-step.html", "18.3 The gtable step", " 18.3 The gtable step Again, still working with our plot p: p <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point(position = position_jitter(seed = 2022)) + geom_smooth(method = "lm", formula = y ~ x) + facet_wrap(vars(year)) + ggtitle("A plot for expository purposes") p # print(p) # plot(p) The return value of ggplot_build() contains the computed data associated with each layer and a Layout ggproto object which holds information about everything other than the layers, including the scales, coordinate system, facets, etc. 
names(ggplot_build(p)) [1] "data" "layout" "plot" ggplot_build(p)$data %>% map(head, 3) class(ggplot_build(p)$layout) [1] "Layout" "ggproto" "gg" The output of ggplot_build() is then passed to ggplot_gtable() to be converted into graphical elements before being drawn: ggplot2:::plot.ggplot function (x, newpage = is.null(vp), vp = NULL, ...) { set_last_plot(x) if (newpage) grid.newpage() grDevices::recordGraphics(requireNamespace("ggplot2", quietly = TRUE), list(), getNamespace("ggplot2")) data <- ggplot_build(x) gtable <- ggplot_gtable(data) if (is.null(vp)) { grid.draw(gtable) } else { if (is.character(vp)) seekViewport(vp) else pushViewport(vp) grid.draw(gtable) upViewport() } if (isTRUE(getOption("BrailleR.VI")) && rlang::is_installed("BrailleR")) { print(asNamespace("BrailleR")$VI(x)) } invisible(x) } <bytecode: 0x5567abddec40> <environment: namespace:ggplot2> 18.3.1 Rendering the panels First, each layer is converted into a list of graphical objects (grobs) … body(ggplot2:::ggplot_gtable.ggplot_built)[[6]] geom_grobs <- by_layer(function(l, d) l$draw_geom(d, layout), plot$layers, data, "converting geom to grob") This step loops through each layer, taking the layer object l and the data associated with that layer d, and uses the Geom from the layer to draw the data. 
geom_grobs <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 7, vars = "geom_grobs" ) geom_grobs [[1]] [[1]]$`1` points[geom_point.points.28774] [[1]]$`2` points[geom_point.points.28776] [[2]] [[2]]$`1` gTree[geom_smooth.gTree.28793] [[2]]$`2` gTree[geom_smooth.gTree.28810] The geom_grobs calculated at this step can also be accessed using the layer_grob() function on the ggplot object, which is similar to the layer_data() function: list( layer_grob(p, i = 1), layer_grob(p, i = 2) ) [[1]] [[1]]$`1` points[geom_point.points.28934] [[1]]$`2` points[geom_point.points.28936] [[2]] [[2]]$`1` gTree[geom_smooth.gTree.28953] [[2]]$`2` gTree[geom_smooth.gTree.28970] Each element of geom_grobs is a list of graphical objects representing a layer’s data in a facet. For example, this draws the data plotted by the first layer in the first facet grid.newpage() pushViewport(viewport()) grid.draw(geom_grobs[[1]][[1]]) After this, the facet takes over and assembles the panels… The graphical representation of each layer in each facet are combined with other “non-data” elements of the plot at this step, where the plot_table variable is defined. 
body(ggplot2:::ggplot_gtable.ggplot_built)[[8]] legend_box <- plot$guides$assemble(theme) plot_table <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 9, vars = "plot_table" ) plot_table is a special grob called a gtable, which is the same structure as the final form of the ggplot figure before it’s sent off to the rendering system to get drawn: plot_table TableGrob (6 x 9) "layout": 16 grobs z cells name grob 1 1 (4-4,3-3) panel-1-1 gTree[panel-1.gTree.29022] 2 1 (4-4,7-7) panel-2-1 gTree[panel-2.gTree.29036] 3 3 (2-2,3-3) axis-t-1-1 zeroGrob[NULL] 4 3 (2-2,7-7) axis-t-2-1 zeroGrob[NULL] 5 3 (5-5,3-3) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29039] 6 3 (5-5,7-7) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29039] 7 3 (4-4,6-6) axis-l-1-2 zeroGrob[NULL] 8 3 (4-4,2-2) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29045] 9 3 (4-4,8-8) axis-r-1-2 zeroGrob[NULL] 10 3 (4-4,4-4) axis-r-1-1 zeroGrob[NULL] 11 2 (3-3,3-3) strip-t-1-1 gtable[strip] 12 2 (3-3,7-7) strip-t-2-1 gtable[strip] 13 4 (1-1,3-7) xlab-t zeroGrob[NULL] 14 5 (6-6,3-7) xlab-b titleGrob[axis.title.x.bottom..titleGrob.29095] 15 6 (4-4,1-1) ylab-l titleGrob[axis.title.y.left..titleGrob.29098] 16 7 (4-4,9-9) ylab-r zeroGrob[NULL] When it is first defined, it’s only a partially complete representation of the plot - title, legend, margins, etc. are missing: grid.newpage() grid.draw(plot_table) Recall that plot_table is the output of layout$render: body(ggplot2:::ggplot_gtable.ggplot_built)[[8]] legend_box <- plot$guides$assemble(theme) This is the load-bearing step that computes/defines a bunch of smaller components internally: ggplot_build(p)$layout$render <ggproto method> <Wrapper function> function (...) 
render(..., self = self) <Inner function (f)> function (self, panels, data, theme, labels) { facet_bg <- self$facet$draw_back(data, self$layout, self$panel_scales_x, self$panel_scales_y, theme, self$facet_params) facet_fg <- self$facet$draw_front(data, self$layout, self$panel_scales_x, self$panel_scales_y, theme, self$facet_params) panels <- lapply(seq_along(panels[[1]]), function(i) { panel <- lapply(panels, `[[`, i) panel <- c(facet_bg[i], panel, facet_fg[i]) coord_fg <- self$coord$render_fg(self$panel_params[[i]], theme) coord_bg <- self$coord$render_bg(self$panel_params[[i]], theme) if (isTRUE(theme$panel.ontop)) { panel <- c(panel, list(coord_bg), list(coord_fg)) } else { panel <- c(list(coord_bg), panel, list(coord_fg)) } ggname(paste("panel", i, sep = "-"), gTree(children = inject(gList(!!!panel)))) }) plot_table <- self$facet$draw_panels(panels, self$layout, self$panel_scales_x, self$panel_scales_y, self$panel_params, self$coord, data, theme, self$facet_params) labels <- self$coord$labels(list(x = self$resolve_label(self$panel_scales_x[[1]], labels), y = self$resolve_label(self$panel_scales_y[[1]], labels)), self$panel_params[[1]]) labels <- self$render_labels(labels, theme) self$facet$draw_labels(plot_table, self$layout, self$panel_scales_x, self$panel_scales_y, self$panel_params, self$coord, data, theme, labels, self$params) } We can inspect these individual components: layout_render_env <- ggtrace_capture_env(p, ggplot2:::Layout$render) # grob in between the Coord's background and the layer for each panel layout_render_env$facet_bg [[1]] zeroGrob[NULL] [[2]] zeroGrob[NULL] # grob in between the Coord's foreground and the layer for each panel layout_render_env$facet_fg [[1]] zeroGrob[NULL] [[2]] zeroGrob[NULL] # individual panels (integrating the bg/fg) layout_render_env$panels [[1]] gTree[panel-1.gTree.29182] [[2]] gTree[panel-2.gTree.29196] # panels assembled into a gtable layout_render_env$plot_table TableGrob (4 x 7) "layout": 12 grobs z cells name 
grob 1 1 (3-3,2-2) panel-1-1 gTree[panel-1.gTree.29182] 2 1 (3-3,6-6) panel-2-1 gTree[panel-2.gTree.29196] 3 3 (1-1,2-2) axis-t-1-1 zeroGrob[NULL] 4 3 (1-1,6-6) axis-t-2-1 zeroGrob[NULL] 5 3 (4-4,2-2) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29199] 6 3 (4-4,6-6) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29199] 7 3 (3-3,5-5) axis-l-1-2 zeroGrob[NULL] 8 3 (3-3,1-1) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29205] 9 3 (3-3,7-7) axis-r-1-2 zeroGrob[NULL] 10 3 (3-3,3-3) axis-r-1-1 zeroGrob[NULL] 11 2 (2-2,2-2) strip-t-1-1 gtable[strip] 12 2 (2-2,6-6) strip-t-2-1 gtable[strip] # individual labels drawn before being added to gtable and returned layout_render_env$labels $x $x[[1]] zeroGrob[NULL] $x[[2]] titleGrob[axis.title.x.bottom..titleGrob.29255] $y $y[[1]] titleGrob[axis.title.y.left..titleGrob.29258] $y[[2]] zeroGrob[NULL] 18.3.1.1 Sneak peak: The rest of the gtable step is just updating this plot_table object. all_plot_table_versions <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = "all", vars = "plot_table" ) names(all_plot_table_versions) [1] "Step8" "Step10" "Step22" "Step23" "Step24" "Step25" "Step26" "Step27" [9] "Step28" "Step29" "Step30" "Step31" "Step32" "Step33" "Step34" lapply(seq_along(all_plot_table_versions), function(i) { ggsave(tempfile(sprintf("plot_table_%02d_", i), fileext = ".png"), all_plot_table_versions[[i]]) }) dir(tempdir(), "plot_table_.*png", full.names = TRUE) %>% magick::image_read() %>% magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>% magick::image_write_gif("images/plot_table_animation1.gif", delay = .5) plot_table_animation1 all_plot_table_versions2 <- ggtrace_inspect_vars( x = p + labs( subtitle = "This is a subtitle", caption = "@yjunechoe", tag = "A" ) , method = ggplot2:::ggplot_gtable.ggplot_built, at = "all", vars = "plot_table" ) identical(names(all_plot_table_versions), names(all_plot_table_versions2)) lapply(seq_along(all_plot_table_versions2), 
function(i) { ggsave(tempfile(sprintf("plot_table2_%02d_", i), fileext = ".png"), all_plot_table_versions2[[i]]) }) dir(tempdir(), "plot_table2_.*png", full.names = TRUE) %>% magick::image_read() %>% magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>% magick::image_write_gif("images/plot_table_animation2.gif", delay = .5) plot_table_animation2 18.3.2 Adding guides The legend (legend_box) is first defined in Step 11: body(ggplot2:::ggplot_gtable.ggplot_built)[[11]] title_height <- grobHeight(title) legend_box <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 12, vars = "legend_box" ) grid.newpage() grid.draw(legend_box) It then undergoes some edits/tweaks, including resolving the legend.position theme setting, and then finally gets added to the plot in Step 15: body(ggplot2:::ggplot_gtable.ggplot_built)[[15]] caption_height <- grobHeight(caption) p_with_legend <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 16, vars = "plot_table" ) grid.newpage() grid.draw(p_with_legend) The bulk of the work was done in Step 11, with the build_guides() function. That in turn calls guides_train() and guides_gengrob() which in turn calls guide_train() and guide_gengrob for each scale (including positional aesthetics like x and y). Why scale? The scale is actually what holds information about guide. They’re two sides of the same coin - the scale translates the underlying data to some defined space, and the guide reverses that (translates a space to data). One’s for drawing, the other is for reading. This is also why all scale_*() functions take a guide argument. Positional scales use guide_axis() as default, and non-positional scales use guide_legend() as default. 
class(guide_legend()) [1] "GuideLegend" "Guide" "ggproto" "gg" # This explicitly spells out the default p + scale_color_discrete(guide = guide_legend()) This is the output of the guide_train() method defined for guide_legend(). The most important piece of it is key, which is the data associated with the legend. # TODO: The unexported function no longer exists, so we had to turn off eval. names( ggtrace_inspect_return(p, ggplot2:::guide_train.legend) ) ggtrace_inspect_return(p, ggplot2:::guide_train.legend)$key The output of guide_train() is passed to guide_gengrob(). This is the output of the guide_gengrob() method defined for guide_legend(): # TODO: The unexported function no longer exists, so we had to turn off eval. legend_gengrob <- ggtrace_inspect_return(p, ggplot2:::guide_gengrob.legend) grid.newpage() grid.draw(legend_gengrob) 18.3.3 Adding adornment It’s everything else after the legend step that we saw in the gifs above. It looks trivial but this step we’re glossing over is ~150 lines of code. But it’s not super complicated - just a lot of if-else statements and a handful of low-level {grid} and {gtable} functions. 
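The kind of low-level {grid} and {gtable} calls mentioned above can be sketched on a toy table. This is only an illustration under assumptions, not the code ggplot2 runs internally: the one-cell layout and the grob names `toy-panel` and `toy-title` are made up for the example.

```r
library(grid)
library(gtable)

# Start from an empty 1 x 1 table whose single cell fills the device.
gt <- gtable(widths = unit(1, "null"), heights = unit(1, "null"))

# Place a stand-in "panel" grob in that cell.
gt <- gtable_add_grob(gt, rectGrob(gp = gpar(fill = "grey90")),
                      t = 1, l = 1, name = "toy-panel")

# Adornments are added by growing the table and filling the new cells.
# pos = 0 inserts the new row above the existing first row.
gt <- gtable_add_rows(gt, heights = unit(1, "lines"), pos = 0)
gt <- gtable_add_grob(gt, textGrob("A toy title"),
                      t = 1, l = 1, name = "toy-title")

# The layout table records which cell each grob occupies.
gt$layout[, c("t", "l", "name")]
```

Running `grid.newpage(); grid.draw(gt)` afterwards would render the toy table, in the same way the plot_table snapshots are drawn elsewhere in these notes.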
18.3.4 Output To put it all together: p_built <- ggplot_build(p) p_gtable <- ggplot_gtable(p_built) grid.newpage() grid.draw(p_gtable) "],["introducing-ggproto.html", "18.4 Introducing ggproto", " 18.4 Introducing ggproto It’s essentially a list of functions String <- list( add = function(x, y) paste0(x, y), subtract = function(x, y) gsub(y, "", x, fixed = TRUE), show = function(x, y) paste0(x, " and ", y) ) Number <- list( add = function(x, y) x + y, subtract = function(x, y) x - y, show = String$show ) String$add("a", "b") [1] "ab" String$subtract("june", "e") [1] "jun" String$show("ggplot", "bookclub") [1] "ggplot and bookclub" Number$add(1, 2) [1] 3 Number$subtract(10, 5) [1] 5 Number$show(1, 2) [1] "1 and 2" 18.4.1 ggproto syntax From the book: Person <- ggproto("Person", NULL, first = "", last = "", birthdate = NA, full_name = function(self) { paste(self$first, self$last) }, age = function(self) { days_old <- Sys.Date() - self$birthdate floor(as.integer(days_old) / 365.25) }, description = function(self) { paste(self$full_name(), "is", self$age(), "old") } ) 18.4.2 ggproto style guide Kind of dense - can read through on your own but most can be picked up as we read the rest of the book. "],["meeting-videos-18.html", "18.5 Meeting Videos", " 18.5 Meeting Videos 18.5.1 Cohort 1 Meeting chat log 00:58:20 Ryan S: so sorry that I have to drop off in a second. I'll look for the remaining few minutes on the video. 00:58:28 Ryan S: Thanks, June! 
Meeting chat log 00:49:23 June Choe: https://yjunechoe.github.io/ggtrace-user2022 00:54:24 June Choe: https://github.com/EvaMaeRey/mytidytuesday/blob/master/2022-01-03-easy-geom-recipes/easy_geom_recipes.Rmd 00:59:31 June Choe: https://www.rstudio.com/resources/rstudioconf-2020/extending-your-ability-to-extend-ggplot2/ "],["extending-ggplot2.html", "Chapter 19 Extending ggplot2", " Chapter 19 Extending ggplot2 Learning objectives: How to overcome the challenge of a particular plot Learn how to extend ggplot2 in different ways "],["overview.html", "19.1 Overview", " 19.1 Overview In this chapter we see how to extend the graphics of a plot, in particular we will see how the following layers and other key parts of a plot are composed, and where the changes can be applied. Themes Stats Geoms Coords Scales Positions Facets Guides "],["themes-1.html", "19.2 Themes", " 19.2 Themes How about creating new theme elements? The base theme is theme_grey(), then here is an example of the modification made on theme_bw() to obtain theme_minimal(). The theme_* is the easiest part of a plot to be modified. Use the print() function on a theme_<…> to see its specifications, the %+replace% operator shows where the substitutions have taken place. 
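As a rough sketch of the same pattern, a new complete theme can be derived from theme_grey() with %+replace%. The name `theme_sketch` and the element choices are hypothetical, picked only to mirror the shape of the built-in theme functions.

```r
library(ggplot2)

# Hypothetical theme: start from theme_grey() and swap out whole elements.
# %+replace% replaces each listed element entirely instead of merging
# individual properties the way `+` would.
theme_sketch <- function(base_size = 11, base_family = "") {
  theme_grey(base_size = base_size, base_family = base_family) %+replace%
    theme(
      panel.background = element_rect(fill = "white", colour = NA),
      panel.grid.minor = element_blank(),
      complete = TRUE
    )
}

p_sketch <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  theme_sketch()
```

Printing `p_sketch` should show the scatterplot on a white panel without minor grid lines, and `print(theme_sketch)` shows the substitution in the same way as the built-in themes.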
print(theme_grey) print(theme_minimal) and %+replace% operator print(theme_bw) function (base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22) { theme_grey(base_size = base_size, base_family = base_family, base_line_size = base_line_size, base_rect_size = base_rect_size) %+replace% theme(panel.background = element_rect(fill = "white", colour = NA), panel.border = element_rect(fill = NA, colour = "grey20"), panel.grid = element_line(colour = "grey92"), panel.grid.minor = element_line(linewidth = rel(0.5)), strip.background = element_rect(fill = "grey85", colour = "grey20"), complete = TRUE) } <bytecode: 0x5567a9987c60> <environment: namespace:ggplot2> print(theme_minimal) function (base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22) { theme_bw(base_size = base_size, base_family = base_family, base_line_size = base_line_size, base_rect_size = base_rect_size) %+replace% theme(axis.ticks = element_blank(), legend.background = element_blank(), legend.key = element_blank(), panel.background = element_blank(), panel.border = element_blank(), strip.background = element_blank(), plot.background = element_blank(), complete = TRUE) } <bytecode: 0x5567a2964c20> <environment: namespace:ggplot2> While if we call the function: print(theme_minimal()), we can see all the options set available. In general, if you want to make a modification to an existing theme, the general approach is to simply use the theme() function while setting complete = TRUE. "],["stats-1.html", "19.3 Stats", " 19.3 Stats Extending stats is one of the most useful ways to extend the capabilities of ggplot2 Stats are purely about data transformations Creating new stats stat with these extension functions: compute_*() setup_*() The logic of a stat is made of subsequent calls: In general the transformation is done to single group starting at the compute_group() level. 
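A minimal sketch of a new Stat built around compute_group(), closely following the convex hull example in the official "Extending ggplot2" vignette. The names `StatChullSketch` and `stat_chull_sketch` are placeholders for this illustration, not part of ggplot2.

```r
library(ggplot2)

# The Stat only transforms data: for each group, keep the rows that lie
# on the convex hull of the (x, y) points.
StatChullSketch <- ggproto("StatChullSketch", Stat,
  compute_group = function(data, scales) {
    data[chull(data$x, data$y), , drop = FALSE]
  },
  required_aes = c("x", "y")
)

# Constructor that wires the Stat into a layer, drawn with a polygon geom.
stat_chull_sketch <- function(mapping = NULL, data = NULL, geom = "polygon",
                              position = "identity", na.rm = FALSE,
                              show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    stat = StatChullSketch, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

p_hull <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  stat_chull_sketch(fill = NA, colour = "black")

# Data for the second layer after the stat has run: hull vertices only.
hull <- layer_data(p_hull, 2)
```

Because the work happens in compute_group(), mapping a grouping aesthetic such as `colour = drv` would give one hull per group with no further changes.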
Before the compute_*() calls come the setup_*() functions, which allow the Stat to react and modify itself in response to the given parameters. Sometimes, with related stats, all that is necessary is to make a subclass and provide new setup_params()/setup_data() methods. print(stat_bin()) geom_bar: na.rm = FALSE, orientation = NA stat_bin: binwidth = NULL, bins = NULL, center = NULL, boundary = NULL, breaks = NULL, closed = c("right", "left"), pad = FALSE, na.rm = FALSE, orientation = NA position_stack "],["geoms-2.html", "19.4 Geoms", " 19.4 Geoms Why make a new geom_? The data cannot be shown meaningfully by any current geom, you need to combine the output of multiple geoms, or you need grobs that are not currently available from existing geoms. The logic of a geom is made of subsequent calls: Implementation is easier for draw_group() setup_params()+setup_data() overwriting the setup_data() Example Reparameterisation of geom_segment() with geom_spoke() print(GeomSpoke$setup_data) <ggproto method> <Wrapper function> function (...) setup_data(...) <Inner function (f)> function (data, params) { data$radius <- data$radius %||% params$radius data$angle <- data$angle %||% params$angle transform(data, xend = x + cos(angle) * radius, yend = y + sin(angle) * radius) } Example geom_smooth() as a combination of geom_line() and geom_ribbon() preparing the data for each of the geoms inside the draw_*() print(GeomSmooth$draw_group) <ggproto method> <Wrapper function> function (...) draw_group(...) 
<Inner function (f)> function (data, panel_params, coord, lineend = "butt", linejoin = "round", linemitre = 10, se = FALSE, flipped_aes = FALSE) { ribbon <- transform(data, colour = NA) path <- transform(data, alpha = NA) ymin = flipped_names(flipped_aes)$ymin ymax = flipped_names(flipped_aes)$ymax has_ribbon <- se && !is.null(data[[ymax]]) && !is.null(data[[ymin]]) gList(if (has_ribbon) GeomRibbon$draw_group(ribbon, panel_params, coord, flipped_aes = flipped_aes), GeomLine$draw_panel(path, panel_params, coord, lineend = lineend, linejoin = linejoin, linemitre = linemitre)) } "],["coords.html", "19.5 Coords", " 19.5 Coords Example: CoordCartesian rescaling the position data Coords takes care of rendering the axes, axis labels, and panel foreground and background and it can intercept both the layer data and facet layout and modify it, with: draw_*() transform() Example print(CoordCartesian$transform) <ggproto method> <Wrapper function> function (...) transform(...) <Inner function (f)> function (data, panel_params) { data <- transform_position(data, panel_params$x$rescale, panel_params$y$rescale) transform_position(data, squish_infinite, squish_infinite) } print(coord_sf) "],["scales-1.html", "19.6 Scales", " 19.6 Scales Example Build a wrapper for a new palette to an existing scale. This is done by providing a new palette scale into the relevant basic scale. print(scale_fill_viridis_c) function (name = waiver(), ..., alpha = 1, begin = 0, end = 1, direction = 1, option = "D", values = NULL, space = "Lab", na.value = "grey50", guide = "colourbar", aesthetics = "fill") { continuous_scale(aesthetics, name = name, palette = pal_gradient_n(pal_viridis(alpha, begin, end, direction, option)(6), values, space), na.value = na.value, guide = guide, ...) 
} <bytecode: 0x5567a2dcb698> <environment: namespace:ggplot2> "],["other-important-parts.html", "19.7 Other important parts", " 19.7 Other important parts 19.7.1 Positions The Position class is slightly simpler than the other ggproto classes. 19.7.2 Facets Look at FacetWrap or FacetGrid, and simply provide new compute_layout(), and map_data() methods 19.7.3 Guides What is a ggproto? The answer is back in chapter20 ggplot2 internals "],["references-1.html", "19.8 References", " 19.8 References Extending ggplot2 A List of ggplot2 extensions ggplot Extension Course "],["meeting-videos-19.html", "19.9 Meeting Videos", " 19.9 Meeting Videos 19.9.1 Cohort 1 Meeting chat log 00:10:52 June Choe: I'm fine with anything! 00:39:47 Federica Gazzelloni: - [Extending ggplot2](https://ggplot2.tidyverse.org/articles/extending-ggplot2.html) - [A List of ggplot2 extensions](https://exts.ggplot2.tidyverse.org/) - [ggplot Extension Course](https://mq-software-carpentry.github.io/r-ggplot-extension/aio.html) - [Example](https://github.com/EvaMaeRey/mytidytuesday/blob/master/2022-01-03-easy-geom-recipes/easy_geom_recipes.Rmd) - [extending-your-ability-to-extend-ggplot2](https://www.rstudio.com/resources/rstudioconf-2020/extending-your-ability-to-extend-ggplot2/) - [ggtrace](https://yjunechoe.github.io/ggtrace-user2022/#/title-slide) "],["a-case-study.html", "Chapter 20 A case study", " Chapter 20 A case study Learning objectives: {These are nice to have, but take some extra work. 
It’s ok to skip these if necessary.} "],["slide-1-title.html", "20.1 {Slide 1 title}", " 20.1 {Slide 1 title} {Create slides as sections marked with ##, but keep them short like a slide.} "],["meeting-videos-20.html", "20.2 Meeting Videos", " 20.2 Meeting Videos 20.2.1 Cohort 1 Meeting chat log "],["mastering-the-grammar.html", "Chapter 21 Mastering the Grammar", " Chapter 21 Mastering the Grammar This was previously covered as chapter 13, but it does not exist as a separate chapter in the current version of the book. Learning Objectives Review the elements and benefits of the grammar of graphics Be able to break down simple graphics into its component parts Mapping Coordinates; define and identify layer and scaling as well as coordinate and faceting Create a process for integrating the grammar into your visual design Apply the grammar to the analysis of existing graphics. References Wickham, H. (2010). A layered grammar of graphics . Journal of Computational and Graphical Statistics, Volume 19, Number 1, 3–28. "],["introduction-6.html", "21.1 Introduction", " 21.1 Introduction Definition of a grammar: “the fundamental principles or rules of an art or science” (OED Online 1989). “In order to unlock the full power of ggplot2, you’ll need to master the underlying grammar. By understanding the grammar, and how its components fit together, you can create a wider range of visualizations, combine multiple sources of data, and customise to your heart’s content.” “The next chapters discuss the components in more detail, and provide more examples of how you can use them in practice.” Grammar versus chart heuristics. Often we match data type to a standard chart type (for example: bar chart for categorical comparisons). 4 parts of a Layer Data and aesthetic mapping. “Along with the data, we need a specification of which variables are mapped to which aesthetics.” (Wickham, 2010, p. 10) Stat. 
“A statistical transformation, or stat, transforms the data, typically by summarizing them in some manner.” (Wickham, 2010, p. 10) Geom. “Geometric objects, or geoms for short, control the type of plot that you create. For example, using a point geom will create a scatterplot, whereas using a line geom will create a line plot. We can classify geoms by their dimensionality: • 0d: point, text, • 1d: path, line (ordered path), • 2d: polygon, interval.” (Wickham, 2010, p. 11) Position adjustment Examples include geom_jitter or how bar plots adjust so the lines do not overlap. Review of key terms Geom: point, bar, boxplot, line Aesthetics: size, color, shape, position Aesthetics finder Benefits of using the Grammar Allows one to iterate in the creation and/or updating of a plot. Gives a language for viewing, and learning from, existing data viz. Enables a better process by focusing the viz developer on the intended purpose of the visual/analysis (not just matching a chart to data). Expands data viz beyond just how to use this particular software syntax. "],["building-a-scatterplot.html", "21.2 Building a scatterplot", " 21.2 Building a scatterplot ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_line() + theme(legend.position = "none") ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_bar(stat = "identity", position = "identity", fill = NA) + theme(legend.position = "none") ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_point() + geom_smooth(method = "lm") + labs(title = "What type of graph would you call this?", subtitle = "Notice the defaults of ggplot2") + theme(plot.title = element_text(size = 15, color = "firebrick", face = "bold", hjust = .5)) + theme(plot.subtitle = element_text(hjust = .5)) "],["scaling.html", "21.3 Scaling", " 21.3 Scaling “The values in the previous table have no meaning to the computer. 
We need to convert them from data units (e.g., litres, miles per gallon and number of cylinders) to graphical units (e.g., pixels and colours) that the computer can display. This conversion process is called scaling and performed by scales.” what we see to what the computer reads we see colours; computer reads hexadecimal string we see size; computer reads a number we see shapes; the computer reads an integer Example in Page 4-6 of Wickham, H. (2010) “Scales typically map from a single variable to a single aesthetic, but there are exceptions. For example, we can map one variable to hue and another to saturation, to create a single aesthetic, color. We can also create redundant mappings, mapping the same variable to multiple aesthetics.” (Wickham, 2010, p. 13) These aesthetic specifications that are meaningful to R are described in vignette(“ggplot2-specs”) Shape Shapes take five types of values: An integer in [0,25]: shapes <- data.frame( shape = c(0:19, 22, 21, 24, 23, 20), x = 0:24 %/% 5, y = -(0:24 %% 5) ) ggplot(shapes, aes(x, y)) + geom_point(aes(shape = shape), size = 5, fill = "red") + geom_text(aes(label = shape), hjust = 0, nudge_x = 0.15) + scale_shape_identity() + expand_limits(x = 4.1) + theme_void() Line type Line types can be specified with: An integer or name: 0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash, as shown below: lty <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash") linetypes <- data.frame( y = seq_along(lty), lty = lty ) ggplot(linetypes, aes(0, y)) + geom_segment(aes(xend = 5, yend = y, linetype = lty)) + scale_linetype_identity() + geom_text(aes(label = lty), hjust = 0, nudge_y = 0.2) + scale_x_continuous(NULL, breaks = NULL) + scale_y_reverse(NULL, breaks = NULL) Font face There are only three fonts that are guaranteed to work everywhere: “sans” (the default), “serif”, or “mono”: df <- data.frame(x = 1, y = 3:1, family = c("sans", "serif", "mono")) ggplot(df, aes(x, y)) + 
geom_text(aes(label = family, family = family)) Colour and fill Note that shapes 21-24 have both stroke colour and a fill. The size of the filled part is controlled by size, the size of the stroke is controlled by stroke. Each is measured in mm, and the total size of the point is the sum of the two. Note that the size is constant along the diagonal in the following figure. sizes <- expand.grid(size = (0:3) * 2, stroke = (0:3) * 2) ggplot(sizes, aes(size, stroke, size = size, stroke = stroke)) + geom_abline(slope = -1, intercept = 6, colour = "white", size = 6) + geom_point(shape = 21, fill = "red") + scale_size_identity() Horizontal and vertical justification have the same parameterisation, either a string (“top”, “middle”, “bottom”, “left”, “center”, “right”) or a number between 0 and 1: top = 1, middle = 0.5, bottom = 0 left = 0, center = 0.5, right = 1 just <- expand.grid(hjust = c(0, 0.5, 1), vjust = c(0, 0.5, 1)) just$label <- paste0(just$hjust, ", ", just$vjust) ggplot(just, aes(hjust, vjust)) + geom_point(colour = "grey70", size = 5) + geom_text(aes(label = label, hjust = hjust, vjust = vjust)) "],["adding-complexity-faceting-coordinates-hierarchy-of-defaults.html", "21.4 Adding complexity; faceting, coordinates, hierarchy of defaults", " 21.4 Adding complexity; faceting, coordinates, hierarchy of defaults facets, multiple layers and statistics ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth() + facet_wrap(~year) “A coordinate system, coord for short, maps the position of objects onto the plane of the plot. Position is often specified by two coordinates (x, y), but could be any number of coordinates. The Cartesian coordinate system is the most common coordinate system for two dimensions, whereas polar coordinates and various map projections are used less frequently.” (Wickham, 2010, p. 13) “Coordinate systems affect all position variables simultaneously and differ from scales in that they also change the appearance of the geometric objects. 
For example, in polar coordinates, bar geoms look like segments of a circle. Additionally, scaling is performed before statistical transformation, whereas coordinate transformations occur afterward.” (Wickham, 2010, p. 13) Coord_polar ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_bar(stat = "identity", position = "identity", fill = NA) + theme(legend.position = "none") + coord_polar() “The angle component is particularly useful for cyclical data because the starting and ending points of a single cycle are adjacent. Common cyclical variables are components of dates, like days of the year or hours of the day, and angles, like wind direction.” (Wickham, 2010, p. 22) “In the grammar, a pie chart is a stacked bar geom drawn in a polar coordinate system.” (Wickham, 2010, p. 22) ggplot(diamonds,aes(x = "", fill=clarity)) + geom_bar(width = 1) + coord_polar (theta="y") Figure 15 shows this, as well as a bullseye plot, which arises when we map the height to radius instead of angle. (Wickham, 2010, p. 22) ggplot(diamonds,aes(x = "", fill=clarity)) + geom_bar(width = 1) + coord_polar (theta="x") The Coxcomb plot is a bar chart in polar coordinates. Note that the categories abut in the Coxcomb, but are separated in the bar chart: this is an example of a graphical convention that differs in different coordinate systems. (Wickham, 2010, p. 
23) library(patchwork) a <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width = 1) + theme(legend.position = "none") b <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width = 1) + coord_polar (theta="y") + theme(legend.position = "none") a + b Defaults The full ggplot2 specification of the scatterplot of price versus weight is: ggplot() + layer( data = diamonds, mapping = aes(x = carat, y = price), geom = "point", stat = "identity", position = "identity" ) + scale_y_continuous() + scale_x_continuous() + coord_cartesian() "],["process-and-examples.html", "21.5 Process and Examples", " 21.5 Process and Examples Process Start with business or research question and purpose Write out grammar Think through chart types, geom options Iterate In the Jan 3, 2022 video, Statistical Rethinking 2022 Lecture 01 Richard McElreath describes a research process (see 19 minute mark): Theoretical Estimand The Scientific (causal) model(s) Use 1 & 2 to build statistical model(s) Simulate from 2 to validate 3 yields 1 Analyze real data Does this translate to a data viz process? Apply the grammar to data viz examples The chapter gives 7 examples including “Napoleon’s march” by Charles John Minard which is also covered in the A Layered Grammar of Graphics article. We will look at examples from here: Our 51 Best (And Weirdest) Charts Of 2021 by FiveThirtyEight Staff (Published Dec. 20, 2021) Resources Wickham, H. (2010). A layered grammar of graphics . Journal of Computational and Graphical Statistics, Volume 19, Number 1, 3–28. Chapter 2 of Fundamentals of Data Visualization by Claus O. Wilke gives an overview of Mapping data onto aesthetics and chapter 3 is on Coordinate systems and axes. "],["meeting-videos-21.html", "21.6 Meeting Videos", " 21.6 Meeting Videos 21.6.1 Cohort 1 "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). 
You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]
+[["index.html", "ggplot2 Book Club Welcome", " ggplot2 Book Club The Data Science Learning Community 2024-08-06 Welcome This is a companion for the book ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham, Danielle Navarro, and Thomas Lin Pedersen. This companion is available at https://r4ds.github.io/bookclub-ggplot2. This website is being developed by the Data Science Learning Community. Follow along, and join the community to participate. This companion follows the Data Science Learning Community Code of Conduct. "],["book-club-meetings.html", "Book club meetings", " Book club meetings Each week, a volunteer will present a chapter from the book. This is the best way to learn the material. Presentations will usually consist of a review of the material, a discussion, and/or a demonstration of the principles presented in that chapter. More information about how to present is available in the github repo. Presentations will be recorded, and will be available on the Data Science Learning Community YouTube Channel. "],["introduction.html", "Introduction", " Introduction Learning objectives: Introduce yourself! Determine whether this club is for you. We will go over the different sections of the book. "],["hi-my-name-is.html", "Hi, my name is…", " Hi, my name is… Camera on or raise your hand if you’re willing to introduce yourself! Name Location and/or timezone Any previous DSLC clubs? Why are you here? "],["present-a-chapter.html", "Present a chapter!", " Present a chapter! Each member of the book club will have the opportunity to lead a chapter. We recommend the following format: Use the slides. Try to not use the book, remember we only have one hour! However, sometimes it could be useful to jump into RStudio and have the code ready to create a graph. Remember to increase font size to 14 (at least) by going to Tools > Global Options > Appearance > Editor font size Try to keep all the content in one visible slide. 
Pick chapters that interest you, either because you know the content and would like to learn more or because they cover things you want to get better at. Follow the How to present instructions on the GitHub README for this Book Club Start each session with start in the comments and end the session with end Introduce the chapter by saying the name of the book we are reading, the cohort, the chapter, and your name. If the book has exercises or you have a specific question regarding something about the chapter, then make sure you have the code ready in RStudio so we can go over this in the last 10 min of the hour. "],["remember-tidytuesday.html", "Remember #TidyTuesday", " Remember #TidyTuesday #TidyTuesday is a great source to keep handy when you are trying to learn about ggplot2. You can follow the hashtag on X, Mastodon, or BlueSky and find other researchers posting links to their GitHub repos. I have learned a lot by studying these repos. "],["welcome-to-ggplot2.html", "Welcome to ggplot2", " Welcome to ggplot2 ggplot2 has an underlying grammar, based on the Grammar of Graphics (Wilkinson 2005), that allows you to compose graphs by combining independent components. You can produce publication-quality graphics in seconds. However, if you have special formatting requirements, ggplot2’s comprehensive theming system makes it easy to do what you want. ggplot2 is designed to work iteratively. You start with a layer that shows the raw data. Then you add layers of annotations and statistical summaries. "],["grammar-of-graphics.html", "Grammar of graphics", " Grammar of graphics The grammar tells us that a graphic maps the data to the aesthetic attributes (color, shape, size) of geometric objects (points, lines, bars). The plot may also include statistical transformations of the data and information about the plot’s coordinate system. Faceting can be used to plot different subsets of the data. The combination of these independent components is what makes up a graphic. 
"],["mapping-components.html", "Mapping components", " Mapping components Plots are composed of the data, the information you want to visualise, and a mapping, the description of how the data’s variables are mapped to aesthetic attributes. There are five mapping components: Layer is a collection of geometric elements and statistical transformations. Geoms for short. Scale: maps values in the data space to values in the aesthetic space. Coord: coordinate system, describes data coordinates to the plane of the graphic. Facet: specifies how to break up and display subsets of data as small multiples. Theme: controls the finer points of display. "],["about-this-book.html", "About this book", " About this book Chapter 2: This chapter introduces several important ggplot2 concepts: geoms, aesthetic mappings and facetting. Chapter 3-9: explore how to use the basic toolbox to solve a wide range of visualisation problems that you’re likely to encounter in practice. Chapter 10-12: show you how to control the most important scales, allowing you to tweak the details of axes and legends. Chapter 13: demonstrates how to add additional layers to your plot, exercising full control over the geoms and stats used within them. Chapter 10-12: will show you what scales are available, how to adjust their parameters, and how to control the appearance of axes and legends. Section 13.7: Faceting is a very powerful graphical tool as it allows you to rapidly compare different subsets of your data. Chapter 17: you will learn about how to control the theming system of ggplot2 and how to save plots to disk. 
"],["prerequisites.html", "Prerequisites", " Prerequisites install.packages(c( "colorBlindness", "directlabels", "dplyr", "ggforce", "gghighlight", "ggnewscale", "ggplot2", "ggraph", "ggrepel", "ggtext", "ggthemes", "hexbin", "Hmisc", "mapproj", "maps", "munsell", "ozmaps", "paletteer", "patchwork", "rmapshaper", "scico", "seriation", "sf", "stars", "tidygraph", "tidyr", "wesanderson" )) "],["meeting-videos.html", "Meeting Videos", " Meeting Videos 0.0.1 Cohort 1 Meeting chat log 00:23:18 Michael Haugen: Could we do 12:30pm CST? I have a meeting until then. 00:25:25 Michael Haugen: Thanks! 00:45:10 Kent Johnson: GitHub repo:https://r4ds.github.io/bookclub-ggplot2/ "],["first-steps.html", "Chapter 1 First Steps ", " Chapter 1 First Steps "],["general-housekeeping-items.html", "1.1 General Housekeeping Items", " 1.1 General Housekeeping Items This is a learning opportunity so feel free to ask any question at any time. Take time to learn the theory, in particular Grammar of Graphics. Please do the chapter exercises. Second-best learning opportunity! Please plan to facilitate one of the discussions. Best learning opportunity! "],["learning-objectives.html", "1.2 Learning Objectives", " 1.2 Learning Objectives Brief introduction to ggplot’s capabilities Learn about key components of every plot: data, aesthetics, geoms Learn about faceting See a few different geoms Modify the axes Save the plot to disk "],["introduction-1.html", "1.3 Introduction", " 1.3 Introduction Leland Wilkinson (Grammar of Graphics, 1999) formalized two main principles in his plotting framework: Graphics = distinct layers of grammatical elements Meaningful plots through aesthetic mappings The essential grammatical elements to create any visualization with {ggplot2} are: "],["main-data-set.html", "1.4 Main data set", " 1.4 Main data set For this chapter, we’ll mainly use the mpg dataset that comes with ggplot. 
mpg ## # A tibble: 234 × 11 ## manufacturer model displ year cyl trans drv cty hwy fl class ## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> ## 1 audi a4 1.8 1999 4 auto… f 18 29 p comp… ## 2 audi a4 1.8 1999 4 manu… f 21 29 p comp… ## 3 audi a4 2 2008 4 manu… f 20 31 p comp… ## 4 audi a4 2 2008 4 auto… f 21 30 p comp… ## 5 audi a4 2.8 1999 6 auto… f 16 26 p comp… ## 6 audi a4 2.8 1999 6 manu… f 18 26 p comp… ## 7 audi a4 3.1 2008 6 auto… f 18 27 p comp… ## 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp… ## 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp… ## 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp… ## # ℹ 224 more rows cty and hwy are miles per gallon measures displ is engine displacement in litres drv is front wheel (f), rear wheel (r) or four wheel (4) model is the model of the car class is two-seater, SUV, compact, etc. "],["components-of-every-plot.html", "1.5 Components of every plot", " 1.5 Components of every plot Three components ggplot(mpg, aes(x = displ, y = hwy)) + geom_point() It’s allowable to omit the x = and y = arguments of aes. In other words, aes(displ, hwy) would be valid for this plot. "],["other-aesthetic-attributes.html", "1.6 Other aesthetic attributes", " 1.6 Other aesthetic attributes color, shape and size can be mapped to variables in the data The class variable of the mpg dataset has seven unique values. The plot can assign a specific color to each value by mapping class to color within the aesthetic function. ## # A tibble: 7 × 1 ## class ## <chr> ## 1 compact ## 2 midsize ## 3 suv ## 4 2seater ## 5 minivan ## 6 pickup ## 7 subcompact ggplot(mpg, aes(displ, hwy, color = class)) + geom_point() Including a color assignment outside the aesthetic of the geometry layer will make all of the points that color. ggplot(mpg, aes(displ, hwy)) + geom_point(color = "blue") Mapping a variable to shape and color adds some diversity and information to the plot. 
ggplot(mpg, aes(displ, hwy, shape = drv, color = drv)) + geom_point() Mapping a variable to size can also add some new insights. ggplot(mpg, aes(manufacturer, drv, size = displ)) + geom_point() + theme(axis.text.x = element_text(angle = 90)) "],["faceting.html", "1.7 Faceting", " 1.7 Faceting Faceting creates graphics by splitting the data into subsets and displaying the same graph for each subset. Really helpful if there are lots of values, making color/shape less meaningful. ggplot(mpg, aes(displ, hwy)) + geom_point() + facet_wrap(~class) Exercise: Use faceting to explore the three-way relationship between fuel economy, engine size and number of cylinders. How does faceting by number of cylinders change your assessment of the relationship between engine size and fuel economy? "],["geoms.html", "1.8 Geoms", " 1.8 Geoms The geom_point() geom gives a familiar scatterplot. Other geoms include: geom_smooth() which fits a smooth line to the data check help to see geom_smooth’s arguments like method, se or span. ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth() geom_boxplot() which generates a box-and-whisker plot check help to see geom_boxplot’s arguments like outlier arguments, and coef which adjusts the whisker length. ggplot(mpg, aes(drv, hwy)) + geom_boxplot() consider boxplot variants like geom_jitter and geom_violin ggplot(mpg, aes(drv, hwy)) + geom_jitter() ggplot(mpg, aes(drv, hwy)) + geom_violin() geom_histogram which generates a histogram and geom_freqpoly which generates a frequency polygon check help to see geom_histogram’s arguments like position and binwidth. ggplot(mpg, aes(hwy)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. ggplot(mpg, aes(hwy)) + geom_freqpoly() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. 
geom_bar which generates a bar chart check help to see geom_bar’s arguments like position and width ggplot(diamonds, aes(cut)) + geom_bar() This graph below uses displ for y in the aesthetic and uses the stat of identity so that it sums the total displacement for each manufacturer. ggplot(mpg, aes(manufacturer, displ)) + geom_bar(stat = "identity") This plot now shows the total displacement. mpg %>% group_by(manufacturer) %>% summarize(sum(displ)) ## # A tibble: 15 × 2 ## manufacturer `sum(displ)` ## <chr> <dbl> ## 1 audi 45.8 ## 2 chevrolet 96.2 ## 3 dodge 162 ## 4 ford 113. ## 5 honda 15.4 ## 6 hyundai 34 ## 7 jeep 36.6 ## 8 land rover 17.2 ## 9 lincoln 16.2 ## 10 mercury 17.6 ## 11 nissan 42.5 ## 12 pontiac 19.8 ## 13 subaru 34.4 ## 14 toyota 100. ## 15 volkswagen 60.9 geom_line and geom_path which generates a line chart or path chart (useful for time series data) check help to see geom_line’s arguments like lineend and arrow ggplot(economics, aes(date, unemploy / pop)) + geom_line() ggplot(economics, aes(date, uempmed)) + geom_line() To investigate these plots further, we can draw them on the same plot. year <- function(x) as.POSIXlt(x)$year + 1900 ggplot(economics, aes(unemploy / pop, uempmed)) + geom_path(color = "grey50") + geom_point(aes(color = year(date))) "],["modifying-the-axes.html", "1.9 Modifying the Axes", " 1.9 Modifying the Axes xlab() and ylab() modify the axis labels ggplot(mpg, aes(cty, hwy)) + geom_point(alpha = 1/3) ggplot(mpg, aes(cty, hwy)) + geom_point(alpha = 1/3) + xlab("city driving (mpg)") + ylab("highway driving (mpg)") # remove labels with NULL ggplot(mpg, aes(cty, hwy)) + geom_point(alpha = 1/3) + xlab(NULL) + ylab(NULL) xlim() and ylim() modify the limits of the axes (boundaries) ggplot(mpg, aes(drv, hwy)) + geom_jitter(width = 0.25) ggplot(mpg, aes(drv, hwy)) + geom_jitter(width = 0.25) + xlim("f", "r") + ylim(20, 30) ## Warning: Removed 139 rows containing missing values or values outside the scale range ## (`geom_point()`). 
"],["output.html", "1.10 Output", " 1.10 Output Save the plot to a variable p <- ggplot(mpg, aes(displ, hwy, color = factor(cyl))) + geom_point() Then print it print(p) Save it to disk ggsave("plot.png", p, width = 5, height = 5) Describe its structure summary(p) ## data: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy, fl, ## class [234x11] ## mapping: x = ~displ, y = ~hwy, colour = ~factor(cyl) ## faceting: <ggproto object: Class FacetNull, Facet, gg> ## compute_layout: function ## draw_back: function ## draw_front: function ## draw_labels: function ## draw_panels: function ## finish_data: function ## init_scales: function ## map_data: function ## params: list ## setup_data: function ## setup_params: function ## shrink: TRUE ## train_scales: function ## vars: function ## super: <ggproto object: Class FacetNull, Facet, gg> ## ----------------------------------- ## geom_point: na.rm = FALSE ## stat_identity: na.rm = FALSE ## position_identity "],["meeting-videos-1.html", "1.11 Meeting Videos", " 1.11 Meeting Videos 1.11.1 Cohort 1 Meeting chat log 00:04:11 Lydia Gibson: Hello! I missed last week but hoping to join weekly moving forward. 00:37:49 June Choe: there's a good cheatsheet for this -- https://ggplot2tor.com/aesthetics 00:54:29 Michael Haugen: One can use geom_col() as well which will work similar to stats = identity 00:58:12 Michael Haugen: section 3.8 in R4DS may be relevant here as well: https://r4ds.had.co.nz/data-visualisation.html 01:07:43 Federica Gazzelloni: thn 01:07:48 June Choe: thanks! "],["individual-geoms.html", "Chapter 2 Individual Geoms", " Chapter 2 Individual Geoms Learning objectives Discuss how geoms are the fundamental building blocks of ggplot2. Draw comparisons between geoms and their associated named plot. Explore each individual geom by reviewing their documentation. "],["the-basics.html", "2.1 The basics", " 2.1 The basics Each geom can be useful by itself. Geoms can be used in ways to construct more complex geoms. 
The geoms discussed in this chapter are two dimensional (e.g., x and y). All geoms understand color or colour and size aesthetics. Bar, tile, and polygon understand fill. The terms above are all parameters within ggplot2 functions. "],["area-chart-geom_area.html", "2.2 Area chart: geom_area()", " 2.2 Area chart: geom_area() Draws an area plot. A line plot filled to the y-axis. Multiple groups are stacked. ggplot(diamonds, aes(x = price)) + geom_area(stat = "bin") ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. ggplot(diamonds, aes(x = price, fill = cut)) + geom_area(stat = "bin") ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. "],["bar-chart-geom_bar.html", "2.3 Bar chart: geom_bar()", " 2.3 Bar chart: geom_bar() Makes a bar plot. ggplot(diamonds, aes(cut)) + geom_bar() What’s up with stat = \"identity\"? The default stat is to count values. Setting this parameter leaves the data unchanged. # Why, though? Perhaps I want to do my own aggregation data_diamond_count <- diamonds |> count(cut, name = "count") ggplot(data_diamond_count, aes(cut, count)) + geom_bar(stat = "identity") "],["line-chart-geom_line.html", "2.4 Line chart: geom_line()", " 2.4 Line chart: geom_line() A geom that connects points from left to right. linetype is a useful parameter. Checkout the different linetypes here. Also here ?linetype ggplot(economics, aes(x = date, y = unemploy)) + geom_line() What’s up with geom_path()? Connects points as they appear in order of the data Answer to exercise 2. ggplot(df, aes(c, y)) + geom_path() ggplot(economics, aes(unemploy / pop, psavert)) + geom_path() "],["scatterplot-geom_point.html", "2.5 Scatterplot: geom_point()", " 2.5 Scatterplot: geom_point() ggplot(mpg, aes(x = displ, y = hwy)) + geom_point() The shape parameter is useful here. interested in the different shapes? 
?shape ggplot(mpg, aes(x = displ, y = hwy, shape = factor(cyl))) + geom_point() "],["polygons-geom_polygon.html", "2.6 Polygons: geom_polygon()", " 2.6 Polygons: geom_polygon() Draws polygons, which are filled paths. Useful when making maps: more in Chapter 6. ggplot(df, aes(c, y)) + geom_polygon() "],["histograms-geom_histogram.html", "2.7 Histograms: geom_histogram()", " 2.7 Histograms: geom_histogram() ggplot(mpg, aes(hwy)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. "],["drawing-rectangles-geom_rect-geom_tile-geom_raster.html", "2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster()", " 2.8 Drawing rectangles: geom_rect(); geom_tile(); geom_raster() ggplot(df, aes(c, y)) + geom_tile() "],["add-text-to-a-plot-geom_text.html", "2.9 Add text to a plot: geom_text()", " 2.9 Add text to a plot: geom_text() This requires the use of the label aesthetic, along with others # Filtering to simplify the example mpg |> filter(manufacturer == "ford") |> ggplot(aes(displ, hwy, label = model)) + geom_text() position and other parameters are also useful. mpg |> filter(manufacturer == "ford") |> ggplot(aes(displ, hwy, label = model)) + geom_text(position = position_dodge(width = 0.2), angle = 45) "],["exercise-solutions.html", "2.10 Exercise solutions", " 2.10 Exercise solutions 2.10.1 Exercise 1 What geoms would you use to draw each of the following named plots? scatterplot = geom_point() line chart = geom_line() histogram = geom_histogram() bar chart = geom_bar() or geom_col() pie chart = geom_bar() with coord_polar() ggplot(data_diamond_count, aes(cut, count)) + geom_col() ggplot(diamonds, aes(x = factor(1), fill = factor(cut))) + geom_bar(width = 1) + coord_polar(theta = "y") 2.10.2 Exercise 2 geom_path() connects points in order of appearance. geom_line connects points from left to right. p + geom_path() geom_polygon() draws polygons which are filled paths. 
p + geom_polygon() geom_line() connects points from left to right. p + geom_line() 2.10.3 Exercise 3 What low-level geoms are used to draw geom_smooth()? geom_smooth() fits a smoother to data, displaying the smooth and its standard error, allowing you to see a dominant pattern within a scatterplot with a lot of “noise”. The low level geom for geom_smooth() are geom_path(), geom_area() and geom_point(). ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth() ## `geom_smooth()` using method = 'loess' and formula = 'y ~ x' What low-level geoms are used to draw geom_boxplot()? Box plots are used to summarize the distribution of a set of points using summary statistics. The low level geom for geom_boxplot() are geom_rect(), geom_line() and geom_point(). ggplot(mpg, aes(drv, hwy)) + geom_boxplot() What low-level geoms are used to draw geom_violin()? Violin plots show a compact representation of the density of the distribution highlighting the areas where most of the points are found. The low level geom for geom_violin() are geom_area() and geom_path(). ggplot(mpg, aes(drv, hwy)) + geom_violin() "],["meeting-videos-2.html", "2.11 Meeting Videos", " 2.11 Meeting Videos 2.11.1 Cohort 1 Meeting chat log 00:13:39 priyanka gagneja: I forgot to mention that since this is relatively smaller chapter Ryan has prepared some material introducing Chapter 4 for today and he will talking about the entire chapter next week. 00:16:38 priyanka gagneja: that's correct 00:16:42 priyanka gagneja: that's my understanding too 00:18:38 priyanka gagneja: what do you mean circles .. can you share a more detailed example 00:21:59 Jiwan Heo: tibble(id = 1:10) %>% mutate(x = cos(2*pi*id/10), y = sin(2*pi*id/10)) %>% ggplot(aes(x, y)) + geom_line() + coord_equal() 00:22:05 Jiwan Heo: vs tibble(id = 1:10) %>% mutate(x = cos(2*pi*id/10), y = sin(2*pi*id/10)) %>% ggplot(aes(x, y)) + geom_path() + coord_equal() 00:35:42 priyanka gagneja: Thank you Ryan !! 
00:38:16 priyanka gagneja: need a min 00:52:22 Michael Haugen: “Side rail” no pun intended 00:52:34 Ryan S: lol 00:52:34 Michael Haugen: sounds great 00:54:02 Ryan Metcalf: I was going to use Derail…..no pun intended! 00:54:05 Ryan Metcalf: Thanks you everyone! "],["collective-geoms.html", "Chapter 3 Collective Geoms ", " Chapter 3 Collective Geoms "],["general-housekeeping-items-1.html", "3.1 General Housekeeping Items", " 3.1 General Housekeeping Items This is a learning opportunity so feel free to ask any question at any time. Take time to learn the theory, in particular Grammar of Graphics. Please do the chapter exercises. Second-best learning opportunity! Please plan to facilitate one of the discussions. Best learning opportunity! "],["learning-objectives-1.html", "3.2 Learning Objectives", " 3.2 Learning Objectives Understand the difference between individual geoms and collective geoms Explore some plots that use individual and collective geoms together Reinforce understanding of the Grammar of Graphics (particularly the use of layers) to create plots "],["quick-intuition-on-collective-geoms.html", "3.3 Quick Intuition on Collective Geoms", " 3.3 Quick Intuition on Collective Geoms Last chapter was on individual geoms. This chapter is on collective geoms. Oversimplification (but maybe useful) individual numbers vs the sum of the numbers sum converts a series of numbers (“individual”): 4, 7, 9, 3, 3 to a single number (“collective”): 26 home prices under individual geoms each home price has a point on a plot/table under collective geoms we may use median as a single number that summarizes all individuals This blog post by Simon Jackson illustrates these foundations using mtcars. The points are individual geoms and the bars are a collective geom showing the average of the individual observations. 
id <- mtcars %>% tibble::rownames_to_column() %>% as_tibble() %>% mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))) gd <- id %>% group_by(am) %>% summarise(hp = mean(hp)) ggplot(id, aes(x = am, y = hp, color = am, fill = am)) + geom_bar(data = gd, stat = "identity", alpha = 0.3) + ggrepel::geom_text_repel(aes(label = rowname), color = "black", size = 2.5, segment.color = "grey") + geom_point() + guides(color = "none", fill = "none") + theme_bw() + labs( title = "Car horsepower by transmission type", x = "Transmission", y = "Horsepower" ) Next, a separate longitudinal study from the blog post (because the book example is also a longitudinal study). This example uses the ourworldindata dataset which shows healthcare spending per country over time. #library(devtools) #install_github("drsimonj/ourworldindata") library(ourworldindata) id <- financing_healthcare %>% filter(continent %in% c("Oceania", "Europe") & between(year, 2001, 2005)) %>% select(continent, country, year, health_exp_total) %>% na.omit() raw data id ## # A tibble: 275 × 4 ## continent country year health_exp_total ## <chr> <chr> <int> <dbl> ## 1 Europe Albania 2001 198. ## 2 Europe Albania 2002 225. ## 3 Europe Albania 2003 236. ## 4 Europe Albania 2004 264. ## 5 Europe Albania 2005 277. ## 6 Europe Andorra 2001 1432. ## 7 Europe Andorra 2002 1565. ## 8 Europe Andorra 2003 1601. ## 9 Europe Andorra 2004 1662. ## 10 Europe Andorra 2005 1794. ## # ℹ 265 more rows individual observations are at the combined country-year level. For the purposes of plotting, though, the “individual geom” will just be the country and all of the yearly observations for each country. 
gd <- id %>% group_by(continent, year) %>% summarise(health_exp_total = mean(health_exp_total)) ggplot(id, aes(x = year, y = health_exp_total, color = continent)) + geom_line(aes(group = country), alpha = 0.3) + geom_line(data = gd, alpha = 0.8, size = 3) + theme_bw() + labs( title = "Changes in healthcare spending\\nacross countries and world regions", x = NULL, y = "Total healthcare investment ($)", color = NULL ) "],["from-the-ggplot2-book.html", "3.4 From the ggplot2 book", " 3.4 From the ggplot2 book dataset called Oxboys which shows the age and corresponding height of 26 boys from Oxford. also a longitudinal study. note that the age is standardized. data(Oxboys, package = "nlme") head(Oxboys, 9) ## Grouped Data: height ~ age | Subject ## Subject age height Occasion ## 1 1 -1.0000 140.5 1 ## 2 1 -0.7479 143.4 2 ## 3 1 -0.4630 144.8 3 ## 4 1 -0.1643 147.1 4 ## 5 1 -0.0027 147.7 5 ## 6 1 0.2466 150.2 6 ## 7 1 0.5562 151.7 7 ## 8 1 0.7781 153.3 8 ## 9 1 0.9945 155.8 9 3.4.1 Multiple Groups, One Aesthetic As the book says: In many situations, you want to separate your data into groups, but render them in the same way. In other words, you want to be able to distinguish individual subjects but not identify them. sometimes you want the individual geom to be a group of observations for the same individual. you do this by adding a group argument to the aesthetic. If you’re trying to figure out which variable to use as the grouping variable, fill in the blank “I have multiple observations for each _____”. Or for longitudinal studies, “I want to plot one line over time for each _____”. What’s the grouping variable for Oxboys? In the case of Oxboys, we want to plot a line over time for each boy, so Subject is the grouping variable in the aesthetic. ggplot(Oxboys, aes(age, height, group = Subject)) + geom_point() + geom_line() incorrectly specifying the grouping variable leads to a “characteristic sawtooth appearance”. 
ggplot(Oxboys, aes(age, height)) + geom_point() + geom_line() 3.4.2 Different Groups on Different Layers From the book: Sometimes we want to plot summaries that use different levels of aggregation: one layer might display individuals, while another displays an overall summary. now that we have plotted individual geoms, let’s add a collective geom which is the trendline for all boys together. ggplot(Oxboys, aes(age, height, group = Subject)) + geom_line() + geom_point() + geom_smooth(method = "lm", se = FALSE) ## `geom_smooth()` using formula = 'y ~ x' #> `geom_smooth()` using formula 'y ~ x' something doesn’t look right expecting a collective geom (one summary line for all subjects), but we got individual geoms again – a trendline for each individual instead of a trendline for all individuals. “grouping controls both the display of the geoms, and the operation of the stats: one statistical transformation is run for each group”. we got multiple geom_smooths because we had the grouping variable in the ggplot line so the grouping flows down to all layers of the plot to get what we intend, we need to uncouple the grouping variable at the ggplot layer and add it where we want the grouping to happen, namely only at the geom_line layer. That allows the default grouping from the ggplot layer (i.e., no special grouping or just group on the whole dataset) to flow down to the geom_smooth layer. ggplot(Oxboys, aes(age, height)) + geom_line(aes(group = Subject)) + geom_point() + geom_smooth(method = "lm", size = 2, se = FALSE) ## `geom_smooth()` using formula = 'y ~ x' #> `geom_smooth()` using formula 'y ~ x' 3.4.3 Overriding the Default Grouping In the last exercise, we finally got the grouping right. This hints at the approach of overriding the default grouping. By adding the grouping to geom_line, we overrode the default grouping, which was “no special grouping”. Here’s another example to help illustrate this point a little better. Thanks to this blog post. 
Subtitles are added to these plots to describe what’s going on. ggplot(mpg, aes(drv, hwy)) + geom_jitter() + stat_boxplot(fill = NA) + labs(subtitle = "stat_boxplot automatically uses the groups set by the categorical variable drv.\\nNotice that there is only one boxplot for each value of drv.") ggplot(mpg, aes(drv, hwy, color = factor(year))) + geom_jitter() + stat_boxplot(fill = NA) + labs(subtitle = "by now adding color based on year, it creates a new group for the boxplots as well,\\nand there are now two for each categorical. This may not be what you want.") ggplot(mpg, aes(drv, hwy, color = factor(year))) + geom_jitter() + stat_boxplot(fill = NA, aes(group = drv)) + labs(subtitle = "we override the default or earlier grouping by adding\\na group -- inside the aes -- on the layer where we want it") ## Warning: The following aesthetics were dropped during statistical transformation: ## colour. ## ℹ This can happen when ggplot fails to infer the correct grouping structure in ## the data. ## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical ## variable into a factor? 3.4.4 A couple of exercises mpg %>% head(2) ## # A tibble: 2 × 11 ## manufacturer model displ year cyl trans drv cty hwy fl class ## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> ## 1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa… ## 2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa… #Draw a boxplot of hwy for each value of cyl, without turning cyl into a factor. What extra aesthetic do you need to set? # Wrong... but cyl is an integer data type -- are integers considered continuous? ggplot(mpg, aes(cyl, hwy)) + geom_boxplot() ## Warning: Continuous x aesthetic ## ℹ did you forget `aes(group = ...)`? # Right ggplot(mpg, aes(cyl, hwy, group = as.factor(cyl))) + geom_boxplot() #Modify the following plot so that you get one boxplot per integer value of displ. 
ggplot(mpg, aes(displ, cty)) + geom_boxplot() ## Warning: Continuous x aesthetic ## ℹ did you forget `aes(group = ...)`? # probably better ways to do this, especially ways to make the boxplot line up with the x-axis ggplot(mpg, aes(x = ceiling(displ), cty, group = ceiling(displ))) + geom_boxplot() 3.4.5 Matching Aesthetics to Graphic Objects (Not covered in the presentation) "],["meeting-videos-3.html", "3.5 Meeting Videos", " 3.5 Meeting Videos 3.5.1 Cohort 1 Meeting chat log 00:21:57 Michael Haugen: only thing I can think of is if 1 equals the first column of the data frame. 01:02:43 priyanka gagneja: thanks Ryan and everyone else 01:06:10 Jiwan Heo: https://github.com/r4ds/bookclub-ps4ds "],["statistical-summaries.html", "Chapter 4 Statistical Summaries", " Chapter 4 Statistical Summaries Learning Objectives: Use ggplot2 to plot possible uncertainty in your data Determine which geometric object (geom) best presents your type of data "],["defintions-in-this-chapter.html", "4.1 Definitions (in this Chapter)", " 4.1 Definitions (in this Chapter) discrete value: a value drawn from a countable set of distinct values, such as categories or integer counts (input user definition welcomed) continuous value: a value that can fall anywhere within a range, so there are infinitely many possible values between any two points (input user definition welcomed) grob: graphical object overplotting: too many points on a scatterplot, obscuring the underlying relationships "],["revealing-uncertainty.html", "4.2 Revealing Uncertainty", " 4.2 Revealing Uncertainty Four primary types of geometric objects (geom) are used: 1. Discrete x, range: geom_errorbar(), geom_linerange() 2. Discrete x, range & center: geom_crossbar(), geom_pointrange() 3. Continuous x, range: geom_ribbon() 4. 
Continuous x, range & center: geom_smooth(stat = \"identity\") y <- c(18, 11, 16) df <- data.frame(x = 1:3, y = y, se = c(1.2, 0.5, 1.0)) base <- ggplot(df, aes(x, y, ymin = y - se, ymax = y + se)) base + geom_crossbar() base + geom_pointrange() base + geom_smooth(stat = "identity") df ## x y se ## 1 1 18 1.2 ## 2 2 11 0.5 ## 3 3 16 1.0 base + geom_errorbar() base + geom_linerange() base + geom_ribbon() "],["weighted-data.html", "4.3 Weighted Data", " 4.3 Weighted Data If each row of your dataframe contains multiple observations, we can use a weight to visually give scale to observations # Unweighted ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point() # Weight by population ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point(aes(size = poptotal / 1e6)) + scale_size_area("Population\\n(millions)", breaks = c(0.5, 1, 2, 4)) # Unweighted ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point() + geom_smooth(method = lm, size = 1) ## `geom_smooth()` using formula = 'y ~ x' # Weighted by population ggplot(midwest, aes(percwhite, percbelowpoverty)) + geom_point(aes(size = poptotal / 1e6)) + geom_smooth(aes(weight = poptotal), method = lm, size = 1) + scale_size_area(guide = "none") ## `geom_smooth()` using formula = 'y ~ x' ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(binwidth = 1) + ylab("Counties") ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(aes(weight = poptotal), binwidth = 1) + ylab("Population (1000s)") Question for the group: Is the above ylab correct? Check out the next two figures, can you see the difference? 
ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(aes(weight = poptotal/1e3), binwidth = 1) + ylab("Population (1000s)") ggplot(midwest, aes(percbelowpoverty)) + geom_histogram(aes(weight = poptotal/1e6), binwidth = 1) + ylab("Population (millions)") "],["displaying-distributions.html", "4.4 Displaying distributions", " 4.4 Displaying distributions Using built-in diamonds dataset Figure 4.1: Diamond Dimensions For 1-Dimensional continuous data (1d), the histogram is arguably the most important geom ggplot(diamonds, aes(depth)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. ggplot(diamonds, aes(depth)) + geom_histogram(binwidth = 0.1) + xlim(55, 70) ## Warning: Removed 45 rows containing non-finite outside the scale range ## (`stat_bin()`). ## Warning: Removed 2 rows containing missing values or values outside the scale range ## (`geom_bar()`). Never rely on the defaults. Always adjust your binwidth or xlim to “zoom” in or out of your data. There is no hard and fast rule; only experimentation will reveal the patterns in your data. For your audience/reader, ensure you add a caption for your scale, for example binwidth. Three ways to compare distributions: - Show small multiples of the histogram, facet_wrap(~ var). - Use colour and a frequency polygon, geom_freqpoly(). - Use a “conditional density plot”, geom_histogram(position = \"fill\"). ggplot(diamonds, aes(depth)) + geom_freqpoly(aes(colour = cut), binwidth = 0.1, na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") ggplot(diamonds, aes(depth)) + geom_histogram(aes(fill = cut), binwidth = 0.1, position = "fill", na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") You can also plot density using geom_density(). Use a density plot when you know that the underlying density is smooth, continuous and unbounded. 
ggplot(diamonds, aes(depth)) + geom_density(na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") ggplot(diamonds, aes(depth, fill = cut, colour = cut)) + geom_density(alpha = 0.2, na.rm = TRUE) + xlim(58, 68) + theme(legend.position = "none") When comparing many distributions, it is often advisable to sacrifice quality for quantity. The following three types of graph are examples of this trade-off. geom_boxplot(): ggplot(diamonds, aes(clarity, depth)) + geom_boxplot() ggplot(diamonds, aes(carat, depth)) + geom_boxplot(aes(group = cut_width(carat, 0.1))) + xlim(NA, 2.05) ## Warning: Removed 997 rows containing missing values or values outside the scale range ## (`stat_boxplot()`). geom_violin(): ggplot(diamonds, aes(clarity, depth)) + geom_violin() ggplot(diamonds, aes(carat, depth)) + geom_violin(aes(group = cut_width(carat, 0.1))) + xlim(NA, 2.05) ## Warning: Removed 997 rows containing non-finite outside the scale range ## (`stat_ydensity()`). geom_dotplot(): 4.4.1 Exercise: What binwidth tells you the most interesting story about the distribution of carat? > Choosing the number of bins or the binwidth should be an exploration exercise. There is no hard-and-fast rule for choosing the binwidth. What is important is to find the size that best captures the distribution you are analysing. This ties into your story: find a binwidth that best conveys the point you are making. Draw a histogram of price. What interesting patterns do you see? ggplot(diamonds, aes(price)) + geom_histogram(binwidth = 5) The smaller the quantity (assuming quality), the higher the price. I presume that carat size would also have a strong correlation with quantity and price. How does the distribution of price vary with clarity? ggplot(diamonds, aes(clarity, price)) + geom_violin() ggplot(diamonds, aes(clarity, price)) + geom_boxplot() I presume using different geoms, the higher the clarity, the higher the price, the fewer the quantity. 
Overlay a frequency polygon and density plot of depth. What computed variable do you need to map to y to make the two plots comparable? (You can either modify geom_freqpoly() or geom_density().) Not completed. "],["dealing-with-overplotting.html", "4.5 Dealing with overplotting", " 4.5 Dealing with overplotting The scatterplot is a very important tool for assessing relationships. Too large a dataset may obscure any true relationship. This is called overplotting. To compensate for overplotting, tweaking the aesthetics can help, for example by using hollow glyphs. df <- data.frame(x = rnorm(2000), y = rnorm(2000)) norm <- ggplot(df, aes(x, y)) + xlab(NULL) + ylab(NULL) norm + geom_point() norm + geom_point(shape = 1) # Hollow circles norm + geom_point(shape = ".") # Pixel sized Alternatively, with large data sets, you can use alpha blending (transparency). If you specify alpha as a ratio, the denominator gives the number of points that must be overplotted to give a solid color. norm + geom_point(alpha = 1 / 3) norm + geom_point(alpha = 1 / 5) norm + geom_point(alpha = 1 / 10) geom_jitter() can be used if your data has some discreteness. By default, the jitter is 40% of the resolution of the data. You can override the default with the width and height arguments. Alternatively, we can think of overplotting as a 2d density estimation problem, which gives rise to two more approaches: Bin the points and count the number in each bin, then visualise that count (the 2d generalisation of the histogram), geom_bin2d(). The code below compares square and hexagonal bins, using parameters bins and binwidth to control the number and size of the bins. norm + geom_bin2d() norm + geom_bin2d(bins = 10) library(hexbin) norm + geom_hex() norm + geom_hex(bins = 10) Another approach to dealing with overplotting is to add data summaries to help guide the eye to the true shape of the pattern within the data. 
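The geom_jitter() behaviour described in this section can be sketched as follows. This is only a minimal sketch, not part of the original notes: it assumes the mpg dataset that ships with ggplot2 (chosen because cty and hwy are integer-valued, so many points overlap), and the jitter amounts of 0.2 are illustrative values rather than recommendations.

```r
library(ggplot2)

# Assumption: mpg is used only for illustration; cty and hwy are
# integer-valued, so many observations fall on the same position.
base_mpg <- ggplot(mpg, aes(cty, hwy))

# Default jitter: 40% of the resolution of the data in each direction
p_default <- base_mpg + geom_jitter()

# Explicit (smaller) jitter combined with alpha blending;
# 0.2 is an arbitrary illustrative amount
p_custom <- base_mpg + geom_jitter(width = 0.2, height = 0.2, alpha = 1 / 3)

p_default
p_custom
```

Jittering moves the points by a small random amount, so two renderings of the same plot will differ unless a seed is set first with set.seed().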
"],["statistical-summaries-1.html", "4.6 Statistical Summaries", " 4.6 Statistical Summaries geom_histogram() and geom_bin2d() use a familiar geom, geom_bar() and geom_raster(), combined with a new statistical transformation, stat_bin() and stat_bin2d(). stat_bin() and stat_bin2d() combine the data into bins and count the number of observations in each bin. But what if we want a summary other than count? So far, we’ve just used the default statistical transformation associated with each geom. Now we’re going to explore how to use stat_summary_bin() to stat_summary_2d() to compute different summaries. ggplot(diamonds, aes(color)) + geom_bar() ggplot(diamonds, aes(color, price)) + geom_bar(stat = "summary_bin", fun = mean) ggplot(diamonds, aes(table, depth)) + geom_bin2d(binwidth = 1, na.rm = TRUE) + xlim(50, 70) + ylim(50, 70) ggplot(diamonds, aes(table, depth, z = price)) + geom_raster(binwidth = 1, stat = "summary_2d", fun = mean, na.rm = TRUE) + xlim(50, 70) + ylim(50, 70) ## Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted ## ℹ Consider using `geom_tile()` instead. ## Raster pixels are placed at uneven horizontal intervals and will be shifted ## ℹ Consider using `geom_tile()` instead. So far we’ve considered two classes of geoms: Simple geoms where there’s a one-on-one correspondence between rows in the data frame and physical elements of the geom Statistical geoms where introduce a layer of statistical summaries in between the raw data and the result Although ggplot2 does not have direct 3d support, it does provide the ability to plot 2d images representing 3d data. These include: contours, colored tiles, and bubble plots. ggplot(faithfuld, aes(eruptions, waiting)) + geom_contour(aes(z = density, colour = ..level..)) ## Warning: The dot-dot notation (`..level..`) was deprecated in ggplot2 3.4.0. ## ℹ Please use `after_stat(level)` instead. ## This warning is displayed once every 8 hours. 
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was ## generated. ggplot(faithfuld, aes(eruptions, waiting)) + geom_raster(aes(fill = density)) # Bubble plots work better with fewer observations small <- faithfuld[seq(1, nrow(faithfuld), by = 10), ] ggplot(small, aes(eruptions, waiting)) + geom_point(aes(size = density), alpha = 1/3) + scale_size_area() "],["meeting-videos-4.html", "4.7 Meeting Videos", " 4.7 Meeting Videos 4.7.1 Cohort 1 Meeting chat log 00:32:41 Michael Haugen: geom_errorbar() otherwise known as tie fighter plot 00:33:12 Gustavo R. Brito: There's some good explanations about geom_smooth (and se too) in rdocumentation: https://rdocumentation.org/packages/ggplot2/versions/3.3.5/topics/geom_smooth 00:46:10 priyanka gagneja: the Grey area is the confidence interval 00:48:32 Federica Gazzelloni: The geom_smooth() function in ggplot2 can plot fitted lines from models with a simple structure. Supported model types include models fit with lm() , glm() , nls() , and mgcv::gam() . ... By default you will get confidence intervals plotted in geom_smooth() . 00:49:53 Federica Gazzelloni: This is a linear model fit, so I use method = "lm". 00:50:21 Stan Piotrowski: You can use “scale_y_continuous()” and some of the functions from the “scales” package to modify axes. 00:50:27 June Choe: the model gets fitted by StatSmooth$compute_group() here, if you're curious about the code! https://github.com/tidyverse/ggplot2/blob/759c63c2fd9e00ba3322c1b74b227f63c98d2e06/R/stat-smooth.r#L156-L173 00:51:31 Federica Gazzelloni: https://aosmith.rbind.io/2018/11/16/plot-fitted-lines/ 00:56:58 Federica Gazzelloni: some formula from the documentation: Formula to use in smoothing function, eg. y ~ x, y ~ poly(x, 2), y ~ log(x). NULL by default, in which case method = NULL implies formula = y ~ x when there are fewer than 1,000 observations and formula = y ~ s(x, bs = "cs") otherwise. 00:57:52 priyanka gagneja: thats ok , keep going. 
we can pick up the rest next time we meet. 00:58:21 Lydia Gibson: I’m going to run to my appointment. See you all next week! 00:58:41 Lydia Gibson: Sorry, in two weeks. 00:59:26 Ryan S: as a side note -- it seemed like the topic of stats (i.e., stat = "identity")…. this topic seemed to get very light treatment in the text. to me it seems like this idea of how stats work is a huge topic that requires a lot of understanding and practice. 00:59:52 Stan Piotrowski: I agree, Ryan. 01:00:18 Ryan S: suggest someone who understands this topic (and has the capacity to talk to it) may be willing to take 15 mins on it next time? 01:01:13 June Choe: I also agree (and would be happy to do this, just not this month!) I feel like we could use another week on stat before we're thrown into the Extending ggplot2 section - maybe around when we cover scales 01:02:25 Michael Haugen: Stat part 2 next week? 01:02:33 Stan Piotrowski: That sounds like a good idea to me! That’ll give us some time to dig into the code and figure out what’s going on 01:02:48 Ryan S: I think we're two weeks away (US holiday next week) 01:02:49 priyanka gagneja: +1 Michael "],["maps.html", "Chapter 5 Maps", " Chapter 5 Maps Learning Objectives: - Plot simple maps using geom_polygon() - Use simple features (sf) to plot GIS data with geom_sf() - Work with map projections and the underlying sf data structure - Draw maps using raster data Plotting geospatial data is a common visualization task. The process may require specialized tools. You can decompose the problem into two paths: - Using one data source to draw a map (if you have GIS data) - Adding metadata from another information source to the map (more common with relation to geographic areas) NOTE: X = Longitude, Y = Latitude. When pronounced “Lat/Lon” it is actually measured as Y/X. Not confusing….just keeping with vocabulary and measurements! "],["polygon-maps.html", "5.1 Polygon Maps", " 5.1 Polygon Maps The simplest approach to mapping is using geom_polygon(). 
This forms boundaries around regions. library(ggplot2) library(dplyr) mi_counties <- map_data("county", "michigan") %>% select(lon = long, lat, group, id = subregion) head(mi_counties) ## lon lat group id ## 1 -83.88675 44.85686 1 alcona ## 2 -83.36536 44.86832 1 alcona ## 3 -83.36536 44.86832 1 alcona ## 4 -83.33098 44.83968 1 alcona ## 5 -83.30806 44.80530 1 alcona ## 6 -83.30233 44.77665 1 alcona In this data set we have four variables: - lat: Latitude of the vertex (as measured by horizontal paths) - lon: Longitude of the vertex (as measured by vertical paths) - id: name of the region - group: unique identifier for contiguous areas within a region ggplot(mi_counties, aes(lon, lat)) + geom_point(size = .25, show.legend = FALSE) + coord_quickmap() ggplot(mi_counties, aes(lon, lat, group = group)) + geom_polygon(fill = "white", colour = "grey50") + coord_quickmap() In this plot, coord_quickmap() is used to adjust the axes to ensure longitude and latitude are rendered on the same scale. For a more advanced use of ggplot2 for mapping, we’ll see the use of geom_sf() and coord_sf() to handle spatial data specified in simple features format. "],["simple-features-maps.html", "5.2 Simple Features Maps", " 5.2 Simple Features Maps The examples above work, but they are not practical for real-world use. Instead, most GIS data is encoded as simple features, a standard produced by the Open Geospatial Consortium (https://www.ogc.org/). 5.2.1 Layered Maps 5.2.2 Labelled Maps 5.2.3 Adding Other Geoms "],["map-projections.html", "5.3 Map Projections", " 5.3 Map Projections "],["working-with-sf-data.html", "5.4 Working with sf Data", " 5.4 Working with sf Data "],["raster-maps.html", "5.5 Raster Maps", " 5.5 Raster Maps "],["data-sources.html", "5.6 Data Sources", " 5.6 Data Sources "],["meeting-videos-5.html", "5.7 Meeting Videos", " 5.7 Meeting Videos 5.7.1 Cohort 1 Meeting chat log 00:11:25 June Choe: hello! 
00:15:21 SriRam: Hi all, I am new here, I came to know about this from ISLR book club 00:16:14 Stan Piotrowski: Great have to have you here, SriRam! Some of us are also in the ISLR book club and I think this is a nice complement to that material 00:25:26 June Choe: I'd like to see the error! 00:26:54 June Choe: I think you'd have to add a geom_labe() layer 00:27:08 June Choe: but as Stan said it'll render text at every point 00:27:29 June Choe: after polygon would draw it on top 00:33:36 Michael Haugen: Reminds me of a Flight of the Concords episode 00:35:04 SriRam: 23.5 00:40:09 SriRam: It would be incorrect data to have multiple geometries on same record 00:47:44 Lydia Gibson: It’s spelled right… or at least that’s how it’s spelled in the book 00:48:16 June Choe: hm maybe the sf_label and sf_text layers also need to take the geometry aesthetic 00:49:02 June Choe: label.padding I think is from geom_label (the white space between text and bounding box) 00:52:51 Federica Gazzelloni: viridis 00:53:30 Federica Gazzelloni: scale_color_viridis() 00:53:42 Federica Gazzelloni: scale_fill_viridis() 00:54:01 SriRam: I think viridis is better for continuous values 00:54:57 Federica Gazzelloni: viridian package: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html 00:55:08 Federica Gazzelloni: viridis package: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html 00:55:51 Federica Gazzelloni: more: https://www.rdocumentation.org/packages/viridis/versions/0.5.1/topics/scale_color_viridis 00:56:21 Michael Haugen: David Robinson uses scale_fill_viridis_c() for a map in his most recent Tidy Tuesday Screen cast. See around 23minute mark: Tidy Tuesday live screencast: Analyzing registered nurses in R. https://www.youtube.com/watch?v=UVmxHb2Daeo&t=486s 01:12:20 Lydia Gibson: Thank you Ryan!! 01:12:34 Federica Gazzelloni: thanks Ryan 01:12:54 Stan Piotrowski: Thanks Ryan! 
Meeting chat log 00:09:20 priyanka gagneja: sorry everyone I just joined 00:09:38 Federica Gazzelloni: Hello! 00:10:02 priyanka gagneja: and will probably be a little in and out .. got a not so happy baby today at home 00:18:11 Stan Piotrowski: I need to take off for a conflict that just came up. Catch up with you all on slack! 00:20:46 SriRam: It is the image product ID 00:21:07 SriRam: All the IDE’s 00:21:50 Kent Johnson: The IDE codes are defined here: https://ropensci.github.io/bomrang/reference/get_available_imagery.html 00:27:32 SriRam: The process is called geo-referencing 00:27:58 SriRam: And image is called a geo-referenced image 00:31:33 SriRam: Yes, it is a reference system 00:31:38 SriRam: A coordinate reference 00:33:04 Federica Gazzelloni: this is the bit that makes the reference: crs = st_crs(sat_vis) 00:49:27 Jiwan Heo: something just came up, and have to leave. See you all next week! 00:59:58 priyanka gagneja: I am signing off now , can someone please address and sign off on my behalf. I will send a msg later on slack "],["networks.html", "Chapter 6 Networks", " Chapter 6 Networks Learning Objectives What is network data? New functions and geoms Visualization of nodes and edges as abstract concepts "],["introduction-2.html", "6.1 Introduction", " 6.1 Introduction This chapter illustrates how to build network data and how to make practical examples using some of the available packages: {tidygraph}, a tidy API for graph manipulation {ggraph} for network visualization {igraph} for generating random and regular graphs "],["what-is-network-data.html", "6.2 What is network data?", " 6.2 What is network data? Network data consists of entities (nodes or vertices) and their relations (edges or links). Edges can be directed or undirected. 6.2.1 A tidy network manipulation API The first package is {tidygraph}, a dplyr-style API for network data. New functions: activate() informs tidygraph on which part of the network you want to work on, either nodes or edges. 
.N() gives access to the node data of the current graph even when working with the edges (and .E() and .G() give access to the edges or the whole graph). In this example we create a graph, assign a random label to the nodes, and sort the edges based on the label of their source node. The function play_erdos_renyi() creates graphs directly through sampling of different attributes. library(tidygraph) graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>% activate(nodes) %>% mutate(class = sample(letters[1:4], n(), replace = TRUE)) %>% activate(edges) %>% arrange(.N()$class[from]) graph ## # A tbl_graph: 10 nodes and 14 edges ## # ## # A directed simple graph with 1 component ## # ## # Edge Data: 14 × 2 (active) ## from to ## <int> <int> ## 1 4 5 ## 2 9 2 ## 3 10 3 ## 4 2 5 ## 5 2 9 ## 6 7 1 ## 7 3 2 ## 8 8 3 ## 9 8 4 ## 10 3 5 ## 11 6 10 ## 12 8 6 ## 13 7 8 ## 14 8 10 ## # ## # Node Data: 10 × 1 ## class ## <chr> ## 1 a ## 2 c ## 3 d ## # ℹ 7 more rows 6.2.2 Conversion Data can be converted with as_tbl_graph(), a data structure for tidy graph manipulation. 
It can convert a data frame encoded as an edge list, as well as the result of hclust(). data(highschool, package = "ggraph") head(highschool) ## from to year ## 1 1 14 1957 ## 2 1 15 1957 ## 3 1 21 1957 ## 4 1 54 1957 ## 5 1 55 1957 ## 6 2 21 1957 With as_tbl_graph() we obtain: hs_graph <- tidygraph::as_tbl_graph(highschool, directed = FALSE) hs_graph ## # A tbl_graph: 70 nodes and 506 edges ## # ## # An undirected multigraph with 1 component ## # ## # Node Data: 70 × 0 (active) ## # ## # Edge Data: 506 × 3 ## from to year ## <int> <int> <dbl> ## 1 1 13 1957 ## 2 1 14 1957 ## 3 1 20 1957 ## # ℹ 503 more rows 6.2.2.1 hclust() and dist() functions: In this example we use luv_colours, a dataset containing all built-in colors() translated into Luv colour space: a data frame with 657 observations and 4 variables. luv_colours <- as.data.frame(convertColor(t(col2rgb(colors())), "sRGB", "Luv")) luv_colours$col <- colors() head(luv_colours) ## L u v col ## 1 9341.570 -3.370649e-12 0.0000 white ## 2 9100.962 -4.749170e+02 -635.3502 aliceblue ## 3 8809.518 1.008865e+03 1668.0042 antiquewhite ## 4 8935.225 1.065698e+03 1674.5948 antiquewhite1 ## 5 8452.499 1.014911e+03 1609.5923 antiquewhite2 ## 6 7498.378 9.029892e+02 1401.7026 antiquewhite3 This visualization represents the content of the dataset; we will then see how it looks in a graph representation. ggplot(luv_colours, aes(u, v)) + geom_point(aes(colour = col), size = 3) + scale_color_identity() + coord_equal() + theme_void() For example, selecting the first 3 variables and plotting the data with the plot() function, we can see that there are some connections within the elements of the dataset, as the colors are connected to each other. 
ggplot2::luv_colours[, 1:3] %>% head ## L u v ## 1 9341.570 -3.370649e-12 0.0000 ## 2 9100.962 -4.749170e+02 -635.3502 ## 3 8809.518 1.008865e+03 1668.0042 ## 4 8935.225 1.065698e+03 1674.5948 ## 5 8452.499 1.014911e+03 1609.5923 ## 6 7498.378 9.029892e+02 1401.7026 plot(ggplot2::luv_colours[, 1:3]) luv_clust <- hclust(dist(ggplot2::luv_colours[, 1:3])) class(luv_clust) ## [1] "hclust" With the tidygraph::as_tbl_graph() function we can transform the dataset into an object of classes “tbl_graph” and “igraph”, ready to use for visualizing the network data. luv_graph <- as_tbl_graph(luv_clust) luv_graph;class(luv_graph) ## # A tbl_graph: 1313 nodes and 1312 edges ## # ## # A rooted tree ## # ## # Node Data: 1,313 × 4 (active) ## height leaf label members ## <dbl> <lgl> <chr> <int> ## 1 0 TRUE "101" 1 ## 2 0 TRUE "427" 1 ## 3 778. FALSE "" 2 ## 4 0 TRUE "571" 1 ## 5 0 TRUE "426" 1 ## 6 0 TRUE "424" 1 ## 7 0 TRUE "425" 1 ## 8 0 FALSE "" 2 ## 9 590. FALSE "" 3 ## 10 1652. FALSE "" 4 ## # ℹ 1,303 more rows ## # ## # Edge Data: 1,312 × 2 ## from to ## <int> <int> ## 1 3 1 ## 2 3 2 ## 3 8 6 ## # ℹ 1,309 more rows ## [1] "tbl_graph" "igraph" 6.2.3 Algorithms The real benefit of networks comes from the different operations that can be performed on them using the underlying structure. 
luv_graph %>% tidygraph::activate(nodes) %>% mutate(centrality = centrality_pagerank()) %>% arrange(desc(centrality)) ## # A tbl_graph: 1313 nodes and 1312 edges ## # ## # A rooted tree ## # ## # Node Data: 1,313 × 5 (active) ## height leaf label members centrality ## <dbl> <lgl> <chr> <int> <dbl> ## 1 0 TRUE 207 1 0.000763 ## 2 0 TRUE 315 1 0.000763 ## 3 0 TRUE 208 1 0.000763 ## 4 0 TRUE 316 1 0.000763 ## 5 0 TRUE 205 1 0.000763 ## 6 0 TRUE 313 1 0.000763 ## 7 0 TRUE 206 1 0.000763 ## 8 0 TRUE 314 1 0.000763 ## 9 0 TRUE 245 1 0.000763 ## 10 0 TRUE 353 1 0.000763 ## # ℹ 1,303 more rows ## # ## # Edge Data: 1,312 × 2 ## from to ## <int> <int> ## 1 1187 1079 ## 2 1187 1080 ## 3 942 797 ## # ℹ 1,309 more rows "],["visualizing-networks.html", "6.3 Visualizing networks", " 6.3 Visualizing networks To visualize the network data we use {ggraph}. It builds on top of {tidygraph} and {ggplot2} to allow a complete and familiar grammar of graphics for network data. 6.3.1 Setting up the visualization Syntax of {ggraph}: ggraph() + ggraph::geom_<functions>; it will choose an appropriate layout based on the type of graph you provide. Getting Started guide to layouts 6.3.1.1 Specifying a layout What is the base requirement? The layout must be a data frame with at least an x and a y column and with the same number of rows as there are nodes in the input graph. As an example we take the data(highschool, package = \"ggraph\") and make a visualization of the graph: hs_graph <- tidygraph::as_tbl_graph(highschool, directed = FALSE) library(ggraph) ggraph(hs_graph) + geom_edge_link() + geom_node_point() A second example is with more features: hs_graph <- hs_graph %>% tidygraph::activate(edges) %>% mutate(edge_weights = runif(n())) ggraph(hs_graph, layout = "stress", weights = edge_weights) + geom_edge_link(aes(alpha = edge_weights)) + geom_node_point() + scale_edge_alpha_identity() In the following examples we see different layouts. 
Information about the “drl” layout (the DrL force-directed graph layout) can be found in the igraph package. layout <- ggraph::create_layout(hs_graph, layout = 'drl') ggraph(layout) + geom_edge_link() + geom_node_point() Instead of {tidygraph} we use {igraph}, with layout = “kk”: layout.kamada.kawai require(ggraph) require(igraph) hs_graph2 <- igraph::graph_from_data_frame(highschool) layout <- create_layout(hs_graph2, layout = "kk") ggraph(layout) + geom_edge_link(aes(colour = factor(year))) + geom_node_point() A very simple example to understand how to make a graph network is from this tutorial: Networks in igraph To understand a bit more about the graph structure we can use these functions: g1 <- igraph::graph( edges=c(1,2, 2,3, 3, 1), n=3, directed=F ) E(g1); # access to the edges ## + 3/3 edges from 43e43ed: ## [1] 1--2 2--3 1--3 V(g1); # the vertices ## + 3/3 vertices, from 43e43ed: ## [1] 1 2 3 g1[] # access to the adjacency matrix ## 3 x 3 sparse Matrix of class "dgCMatrix" ## ## [1,] . 1 1 ## [2,] 1 . 1 ## [3,] 1 1 . 6.3.1.2 Circularity Layouts can be linear and circular. 
coord_polar() changes the whole coordinate system, edges included; to change only the position of the nodes without affecting the edges, set circular = TRUE in the layout instead. ggraph(luv_graph, layout = 'dendrogram', circular = TRUE) + geom_edge_link() + coord_fixed() ggraph(luv_graph, layout = 'dendrogram') + geom_edge_link() + coord_polar() + scale_y_reverse() 6.3.2 Drawing nodes points more specialized geoms: tiles geom_node_<functions> geom_node_point() geom_node_tile() Getting Started guide to nodes ggraph(luv_graph, layout = "stress") + geom_edge_link() + geom_node_point(aes(colour =factor(members)), show.legend = F) More features could be added to calculate node and edge centrality, such as: centrality_power() centrality_degree() ggraph(luv_graph, layout = "stress") + geom_edge_link() + geom_node_point(aes(colour =centrality_power())) Or making tiles: ggraph(luv_graph, layout = "treemap") + geom_node_tile(aes(fill = depth)) 6.3.3 Drawing edges geom_edge_link() draws a straight line between the connected nodes; under the hood, it splits the line into many small fragments. 
geom_edge_link() geom_edge_link2() geom_edge_fan() geom_edge_parallel() geom_edge_elbow() geom_edge_bend() geom_edge_diagonal() Getting Started guide to edges The after_stat(index): set.seed(123) ggraph(hs_graph, layout = "stress") + geom_edge_link(aes(alpha = after_stat(index))) Here is an example of how to use the node.class variable. The graph is the first one we saw, artificially generated with tidygraph::play_erdos_renyi(): graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>% activate(nodes) %>% mutate(class = sample(letters[1:4], n(), replace = TRUE)) %>% activate(edges) %>% arrange(.N()$class[from]) ggraph(graph, layout = "stress") + geom_edge_link2( aes(colour = node.class), width = 3, lineend = "round") ggraph(hs_graph, layout = "stress") + geom_edge_parallel() Trees and specifically dendrograms: ggraph(luv_graph, layout = "dendrogram", height = height) + geom_edge_elbow() 6.3.3.1 Clipping edges around the nodes Example: using arrows to show directionality of edges set.seed(1011) ggraph(graph, layout = "stress") + geom_edge_link( arrow = arrow(), start_cap = circle(5, "mm"), end_cap = circle(5, "mm") ) + geom_node_point(aes(colour = class), size = 8) 6.3.3.2 An edge is not always a line Nodes and edges are abstract concepts and can be visualized in a multitude of ways. geom_edge_point() ggraph(hs_graph, layout = "matrix", sort.by = node_rank_traveller()) + geom_edge_point() 6.3.4 Faceting facet_nodes() facet_edges() facet_graph() ggraph(hs_graph, layout = "stress") + geom_edge_link() + geom_node_point() + facet_edges(~year) "],["conclusions.html", "6.4 Conclusions", " 6.4 Conclusions Making a {ggraph} plot means understanding the different classes of data that can be used inside the function. It is also very important to have a clear idea of the graph structure you would like to achieve to represent your data. There are many layouts available, and they differ by the class of the data provided. 
In addition, do not forget that you can draw network data using {ggplot2} as well. 6.4.1 Resources: tidygraph website Data Imaginist Imaginist layouts Network analysis with r R and igraph Getting Started guide to layouts Getting Started guide to nodes Getting Started guide to edges "],["meeting-videos-6.html", "6.5 Meeting Videos", " 6.5 Meeting Videos 6.5.1 Cohort 1 Meeting chat log 00:09:20 priyanka gagneja: sorry everyone I just joined 00:09:38 Federica Gazzelloni: Hello! 00:10:02 priyanka gagneja: and will probably be a little in and out .. got a not so happy baby today at home 00:18:11 Stan Piotrowski: I need to take off for a conflict that just came up. Catch up with you all on slack! 00:20:46 SriRam: It is the image product ID 00:21:07 SriRam: All the IDE’s 00:21:50 Kent Johnson: The IDE codes are defined here: https://ropensci.github.io/bomrang/reference/get_available_imagery.html 00:27:32 SriRam: The process is called geo-referencing 00:27:58 SriRam: And image is called a geo-referenced image 00:31:33 SriRam: Yes, it is a reference system 00:31:38 SriRam: A coordinate reference 00:33:04 Federica Gazzelloni: this is the bit that makes the reference: crs = st_crs(sat_vis) 00:49:27 Jiwan Heo: something just came up, and have to leave. See you all next week! 00:59:58 priyanka gagneja: I am signing off now , can someone please address and sign off on my behalf. I will send a msg later on slack Meeting chat log 00:15:36 Ryan S: https://www.youtube.com/playlist?list=PLkrJrLs7xfbWjD2rp3pIV85lby-tR3Cnu 00:15:48 Lydia Gibson: Thanks Ryan! 00:16:00 Ryan S: link to a very good basic tutorial on simple features 00:53:58 Lydia Gibson: What is GPU? 
00:54:23 Ryan S: GPU is the graphics processing unit (I think) 00:54:31 Lydia Gibson: Thank you 00:54:32 Ryan S: it's the part that "draws" on your screen 00:54:43 Lydia Gibson: Oh okay 00:54:56 Ryan S: versus the CPU that does calculations 00:55:48 SriRam: For 2D and non texture plots, I think it is more a RAM issue 00:55:59 Ryan Metcalf: Oh. I’m so sorry for using Acronyms! Ryan S. is correct. The balance I’m asking Federica is related….”Can I use a slow Laptop or do I have to use a super computer with massive Video card to render these types of graphical objects. 00:56:23 Lydia Gibson: I always thought CPU was synonymous with computer. 00:57:22 Ryan S: Ryan, just repurpose the GPUs you currently have that are mining crypto 00:57:38 Ryan Metcalf: :) Agreed!!! 00:57:57 SriRam: Lol 00:59:14 SriRam: If you have a spatial network, do not miss out on “sfnetworks” package 01:00:07 Federica Gazzelloni: https://kateto.net/netscix2016.html 01:00:14 Federica Gazzelloni: https://www.data-imaginist.com/2017/ggraph-introduction-layouts/ 01:00:24 Federica Gazzelloni: https://www.hcbravo.org/networks-across-scales/misc/tidygraph.nb.html 01:00:41 Federica Gazzelloni: https://igraph.org/r/doc/layout_with_drl.html 01:00:48 Federica Gazzelloni: https://tidygraph.data-imaginist.com/reference/index.html#section-misc 01:00:57 Federica Gazzelloni: https://ggraph.data-imaginist.com/articles/Layouts.html 01:01:52 Federica Gazzelloni: https://web.stanford.edu/class/bios221/book/Chap-Graphs.html https://github.com/jtichon/ModernStatsModernBioJGT/tree/master/data https://simplemaps.com/data/world-cities 01:07:00 SriRam: Tidy is I think , Hadley definition, variable is a column, sample point is a row 01:07:36 SriRam: Sorry my microphone does not work since a few sessions now 🙁 01:07:50 Ryan S: borders on "marketing" to some degree. 
:) 01:08:08 SriRam: Sfnetwork is not for graphs, it is more for spatial operations "],["annotations.html", "Chapter 7 Annotations", " Chapter 7 Annotations Learning Objectives Plot and Axis Titles; Providing context for the visual, and changing the look of plot elements and overall appearance Text Labels; mapping text from data or having text appear on graphs as data Building Custom Annotations; how to write summaries, context, arrows, and textual meta data to graphs Direct Labeling and Faceting; related packages for special issues such as highlighting, textboxes, html text "],["introduction-3.html", "7.1 Introduction", " 7.1 Introduction Packages - ggtext - ggtheme - gghighlight - palmerpenguins - ggrepel - grid Functions - geom_text - geom_label - theme(plot.title = element_text()) - geom = “curve” - geom_vline Resource - A ggplot Tutorial For Beautiful Plotting in R by Cedric Scherer August 5, 2019. Annotation Definitions “Conceptually, an annotation supplies metadata for the plot: that is, it provides additional information about the data being displayed. From a practical standpoint, however, metadata is just another form of data. Because of this, the annotation tools in ggplot2 reuse the same geoms that are used to create other plots.” Wickham, H., Navarro, N., & Lin Pedersen, T. (2016). Ggplot2: Elegant graphics for data analysis (Second ed.) Springer. “[Annotation] concerns judging the level of assistance an audience may require in order to understand the background, function and purpose of a project, as well as what guidance needs to be provided to help viewers perceive and interpret the data representations.” Kirk, Andy. Data Visualisation (p. 231). SAGE Publications. Kindle Edition. 
"],["plot-and-axis-titles.html", "7.2 Plot and Axis Titles", " 7.2 Plot and Axis Titles base <- ggplot(penguins, aes(bill_length_mm, bill_depth_mm, color = species, shape = species)) + geom_point(alpha = .4) + geom_point(data = gd, size = 4) + theme_bw() + labs( title = "How does Bill Size Differ by species?", subtitle = "Source: Palmer Station Antarctica LTER and K. Gorman, 2020", x = "*Length*", y = "Width", caption = "ggplot 2 Book Club") + theme(plot.title = element_text(color = "midnightblue", hjust = .5, face = "bold")) + theme(plot.subtitle = element_text(hjust = .5, size = 9)) + theme(axis.title.x = ggtext::element_markdown()) line breaks quote() for mathematical expressions. ?plotmath removing labels two ways: labs(x = "") and labs(x = NULL) "],["text-labels.html", "7.3 Text labels", " 7.3 Text labels 8.2 Text labels - geom_text() - geom_text() adds label text at the x and y coordinates of a graph, such as a name instead of a circle in a scatter plot. Change the font with the family aesthetic. The packages showtext and extrafont can help with handling fonts across different devices. Change the fontface aesthetic for plain, bold, or italic “faces”. Alignment: hjust (“left”, “center”, “right”, “inward”, “outward”) and vjust (“bottom”, “middle”, “top”, “inward”, “outward”) aesthetics. 
vjust = “inward”, hjust = “inward” ensures labels stay in the plot geom_text(aes(label = text), vjust = “inward”, hjust = “inward”) df <- data.frame(x = 1, y = 3:1, face = c("plain", "bold", "italic")) ggplot(df, aes(x, y)) + geom_text(aes(label = face, fontface = face, ), vjust = "inward", hjust = "inward", size = 20, angle = 10) base + geom_text(aes(label = body_mass_g), check_overlap = TRUE) base + geom_label(aes(label = body_mass_g)) ggplot(mpg, aes(displ, hwy)) + geom_text(aes(label = model)) + xlim(1, 8) ggplot(mpg, aes(displ, hwy)) + geom_text(aes(label = model)) + xlim(1, 8) ggplot(mpg, aes(displ, hwy)) + geom_text(aes(label = model), check_overlap = TRUE) + xlim(1, 8) library(ggrepel) ggplot(mpg, aes(displ, hwy)) + geom_text_repel(aes(label = model)) + xlim(1, 8) label <- data.frame( waiting = c(55, 80), eruptions = c(2, 4.3), label = c("peak one", "peak two") ) ggplot(faithfuld, aes(waiting, eruptions)) + geom_tile(aes(fill = density)) + geom_label(data = label, aes(label = label)) geom_label "],["annotations-1.html", "7.4 Annotations", " 7.4 Annotations 8.3 Annotations - ggplot2 annotation options - geom_text and geom_label geom_rect() geom_line(), geom_path(), geom_segment(), arrow() geom_vline(), geom_hline(), geom_abline() annotate() which can be used in combination with arrow() base + annotate( geom = "text", x = 42, y = 20, label = "The Adelie species is on all 3 islands", size = 5, color = "darkcyan") Arrows Code base + annotate( geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53.1, y = 20, label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 32, y = 20.3, label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 53, y 
= 15, xend = 48, yend = 15, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53, y = 15.3, label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan") Arrows Plot base + annotate( geom = "curve", x = 53, y = 20, xend = 49, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53.1, y = 20, label = "Average Chinstrap", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 35, y = 20, xend = 38, yend = 18.5, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 32, y = 20.3, label = "Average Adelie", hjust = "left", size = 4, color = "darkcyan") + annotate( geom = "curve", x = 53, y = 15, xend = 48, yend = 15, curvature = .3, size = 1, arrow = arrow(length = unit(3, "mm")) ) + annotate(geom = "text", x = 53, y = 15.3, label = "Average Gentoo", hjust = "left", size = 4, color = "darkcyan") + theme(legend.position = "none") astronauts %>% filter(nationality %in% c("U.S.","Australia", "U.K.", "U.S.S.R/Russia", "Japan")) %>% ggplot(aes(x = nationality, y = hours_mission, color = hours_mission)) + coord_flip() + geom_point(size = 4, alpha = 0.15) + geom_boxplot(color = "gray60", outlier.alpha = 0) + stat_summary(fun = mean, geom = "point", size = 5, color = "dodgerblue") + annotate( geom = "curve", x = 3.8, y = 2500, xend = 4, yend = 650, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + annotate( "text", x = 3.7, y = 2500, label = "The U.S. 
Mean Hours Mission", size = 2.7) + annotate( geom = "curve", x = 4.7, y = 4200, xend = 5, yend = 2800, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + annotate( "text", x = 4.5, y = 3700, label = "The interquartile range, between 25% and 75% of values", size = 2.8) + annotate( geom = "curve", x = 1, y = 3800, xend = 1, yend = 900, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + annotate( "text", x = .8, y = 3000, label = "Australian Astronaut Andrew S. W. Thomas completed missions in 1983, 1998, 2001, 2005 and is now retired", size = 2.8) + scale_color_viridis_c() + scale_y_continuous(limits = c(0, 5000)) + labs(title = "Length of Astronaut Missions in hours", subtitle = "A Study was conducted on the effects of space on various individuals", caption = "Source: TidyTuesday 2020 week 29 \\n inspired by plots in The Evolution of a ggplot (ep1) by Cedric Scherer") + theme_fivethirtyeight() + theme(legend.position = "none") + theme(plot.title = element_text(hjust = .5)) + theme(plot.subtitle = element_text(hjust = .5)) "],["directlabels-package.html", "7.5 Directlabels Package", " 7.5 Directlabels Package Place labels closer to the data than legends ggforce() gghighlight() Base Code Nurse Salary library(ggthemes) library(scales) g <- nurses %>% group_by(year) %>% filter(state %in% c("Minnesota", "Wisconsin", "Iowa", "North Dakota", "Illinois", "Indiana", "Kansas", "Michigan", "Missouri", "Nebraska", "Ohio")) %>% ggplot(aes(year, annual_salary_median, color = state)) + geom_line() + labs( title = "Annual Median RN Salary by Midwestern State" ) + theme(legend.position = "none") + geom_vline(xintercept = c(2007, 2009), size = 1.5, color = "darkgoldenrod1", linetype = "dashed") + gghighlight::gghighlight(state %in% c("Minnesota", "Wisconsin", "Iowa")) + theme_economist() + scale_color_economist(name = NULL) + theme(axis.title = element_blank()) + scale_y_continuous(labels = comma_format()) gghighlight and facets base + 
gghighlight::gghighlight() + facet_wrap(~ species) examples in geom_richtext library(ggtext) lab_html <- "★ geom_richtext can modify with HTML" g + geom_richtext(aes(x = 2010, y = 50000, label = lab_html), stat = "unique", angle = 30, color = "white", fill = "steelblue") geom_textbox lab_long <- "**The Great Recession** <br><b style='font-size:10pt;color:steelblue;'> Minnesota's RN Annual Salaries increased during the great recession and then completely flattened out before rising again after 2015</b>" g + geom_textbox(aes(x = 2015, y = 40000, label = lab_long), width = unit(15, "lines"), stat = "unique") "],["faceting-annotations.html", "7.6 Faceting Annotations", " 7.6 Faceting Annotations g + facet_wrap(~state, scales = "free_x") Grid package scales coordinates between 0 and 1 library(grid) my_grob <- grobTree(textGrob("Great Recession", x = .2, y = .9, hjust = 0, gp = gpar(col = "black", fontsize = 10, fontface = "bold"))) g + annotation_custom(my_grob) + facet_wrap(~state, scales = "free_x") "],["resources-1.html", "7.7 Resources", " 7.7 Resources ggplot2 book chapter 8 annotations A ggplot Tutorial For Beautiful Plotting in R by Cedric Scherer August 5, 2019 The Evolution of a ggplot (EP.1) by Cedric Scherer Introduction to gghighlight by Hiroaki Yutani 2021-06-05 "],["meeting-videos-7.html", "7.8 Meeting Videos", " 7.8 Meeting Videos 7.8.1 Cohort 1 Meeting chat log 00:10:14 Ed: Hi everyone. My connection is shaky so if I drop off don’t take it personally. 😇 00:10:32 Michael Haugen: Thanks for joining us! 00:10:42 Ryan Metcalf: Great to see you. No worries at all. 00:24:25 Ryan Metcalf: To support Michael’s quote, I mentioned a Swedish Statician…Hans Rosling. The Gapminder project was his brain child. Great Ted Talks were delivered by the user: https://www.ted.com/speakers/hans_rosling 00:32:42 June Choe: re: text/font rendering - {ragg} + {systemfonts} is now recommended over {showtext}/{extrafont}! 
00:32:59 June Choe: https://yjunechoe.github.io/posts/2021-06-24-setting-up-and-debugging-custom-fonts/ 00:33:39 Federica Gazzelloni: @June thanks 00:34:28 June Choe: here's some quotes from Thomas Lin Pedersen (ggplot2 dev) on showtext/extrafont - https://twitter.com/thomasp85/status/1355083725156077571 https://twitter.com/thomasp85/status/1261539815960518656 00:39:31 Ed: So is it necessary to hard code the locations for those arrows? It won't stop them where it makes sense to go? 00:39:46 Ed: What about different resolution screens, etc. 00:41:36 Kent Johnson: Yes, you have to hard-code the arrow start and end. 00:42:09 Ed: 👍 00:42:46 Kent Johnson: My experience is, it's pretty fiddly to get something really nice. I don't know how plot size / screen resolution affect the arrows. 00:43:42 Ryan Metcalf: https://fivethirtyeight.com/ 00:46:42 June Choe: linewidth and arrow size would be subject to resolution but not the stard/end points 00:47:07 June Choe: start/end points are converted to native coordinate units but size is absolute 00:47:46 Ed: 👍 00:48:03 June Choe: (which is why you should never rely just on plot panel output and always use something like ggsave!) 00:48:58 Ed: Awesome tip. Could see myself getting frustrated but good to know going into it. 00:49:35 June Choe: since like an update or two ago, ggsave() started returning the path to the saved image invisibly, so if you 00:50:07 June Choe: if you're on windows, you can do something like `system2("open", ggsave("img.png"))` and itll open up the plot after saving it 00:50:27 June Choe: (open it back up using your system's default photo viewing app) 00:58:21 Ryan Metcalf: Sheesh! This took me forever to find! I mentioned Arrows outside of a graphic. I was using it with D3 objects (similar to ggplot2). 
https://github.com/krispo/yarrow 01:01:04 June Choe: big fan - and you should check out {sinab} as well for a more powerful version of ggtext by the same dev (though this one's heavily experimental and requires Rust) - https://clauswilke.com/sinab/ 01:01:18 Michael Haugen: thanks 01:03:14 June Choe: the 0-1 coord scale in grid here is called "npc" (Normalized Parent Coordinates) 01:04:21 Ryan Metcalf: June, you are a wealth of knowledge! 🙂I may ping you outside of Zoom (Slack) for further discussions on Graphical Objects. 01:05:00 Ryan S: Awesome job Michael! 01:05:12 June Choe: For sure @Ryan ! Always happy to talk about data viz 01:05:15 June Choe: and thanks for presenting Michael! 01:05:50 June Choe: xaringanExtra i think 01:06:22 June Choe: https://pkg.garrickadenbuie.com/xaringanExtra/#/extra-styles 01:07:31 Federica Gazzelloni: Thanks Michael "],["arranging-plots.html", "Chapter 8 Arranging Plots", " Chapter 8 Arranging Plots Learning Objectives Produce several subplots as part of the same main visualization A range of packages providing different approaches to arranging separate plots "],["introduction-4.html", "8.1 Introduction", " 8.1 Introduction This chapter focuses on making more than one plot in one visualization, using the following packages: patchwork cowplot gridExtra ggpubr "],["arranging-plots-side-by-side-with-no-overlap.html", "8.2 Arranging plots side by side with no overlap", " 8.2 Arranging plots side by side with no overlap 8.2.1 Taking control of the layout More compositions: 8.2.2 More about layouts This way, a custom modification of the theme is possible for one plot or for both. 8.2.3 Plot annotations "],["arranging-plots-on-top-of-each-other.html", "8.3 Arranging plots on top of each other", " 8.3 Arranging plots on top of each other It is possible to arrange plots so that one is nested inside another, and to set the position of the inset inside the main plot. 
General options are left, right, top, and bottom locations, but more specific locations can be set using grid::unit() (the default uses npc units, which go from 0 to 1). In addition, the location is by default aligned to the panel area, but align_to can switch it to the plot area. An inset can be placed exactly 15 mm from the top right corner. "],["extra.html", "8.4 Extra", " 8.4 Extra grid and gridExtra packages cowplot package To add a common title we use ggdraw() ggpubr package "],["conclusions-1.html", "8.5 Conclusions", " 8.5 Conclusions Patchwork - imaginist is one of the packages mentioned in the book; some other packages provide the same results with different approaches. 8.5.1 Extra resources: grid and gridExtra cowplot ggpubr "],["meeting-videos-8.html", "8.6 Meeting Videos", " 8.6 Meeting Videos 8.6.1 Cohort 1 Meeting chat log 00:27:47 Lydia Gibson: What are npc unts? 00:27:56 Michael Haugen: "npc" (Normalized Parent Coordinates) 00:28:02 Michael Haugen: 0 to 1 00:28:07 Lydia Gibson: Oh okay. Thank you! 00:28:16 Michael Haugen: Same thing that was used for faceting annotations. 00:28:43 Michael Haugen: so .8 is l80 percent of the way up the y axis for example. 00:28:47 Lydia Gibson: I missed annotations last week. I’ll have to go back and watch the session. 00:43:21 SriRam: I use patch and cowplot 00:50:00 Kent Johnson: https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/ 00:50:09 Lydia Gibson: Thank you! 00:50:17 Michael Haugen: The arrows are in chapter 8.3 with geom and curve for example, annotate( geom = "curve", x = 4, y = 35, xend = 2.65, yend = 27, curvature = .3, arrow = arrow(length = unit(2, "mm")) ) + 00:51:16 Michael Haugen: and arrows came up in the discussion as a discussion of the GROB and arrows and how to render your plot so the arrows are not distorted. 00:51:16 Ryan Metcalf: Perfect! “arrow” was the argument I was after! 
00:51:37 Michael Haugen: And then we talked about ggsave as a part of that 00:53:32 Michael Haugen: we all will be at Cedric’s level by the end of this bookclub right? 00:53:47 SriRam: :D 00:53:52 Lydia Gibson: Hopefully lol 00:54:20 Ryan S: Thank you! 00:54:29 SriRam: Thank you Meeting chat log 00:04:54 June Choe: https://yjunechoe.github.io/ggtrace-talk/ 00:12:16 Ryan S: brilliant... didn't know this before but really simplifies the concept 00:14:08 Michael Haugen: Makes sense 00:26:15 June Choe: https://ggplot2.tidyverse.org/reference/aes_eval.html 00:26:36 Ryan Metcalf: I’m thinking in the context….I buy a car. The engineers have optimized it for longevity….but I want a hot rod….So I need to open the hood and change parts. Or, access the computer and start changing parameters. 00:32:54 SriRam: This is like scuba diving, more beautiful under the surface :) 00:33:15 Stan Piotrowski: Great analogy, SriRam! 00:33:17 Ryan Metcalf: Completely agree @SriRam! 00:36:44 June Choe: ggplot2:::ggplot_build.ggplot 00:37:58 June Choe: ggplot2:::print.ggplot 00:50:22 Federica Gazzelloni: thanks June!!! 00:50:54 SriRam: Out of curiosity, how much of this trickery (internal functions) can be learnt from "advanced R" or are these mentioned in the ggplot book ? I am just a regular user, I may not go this deep, but looks very interesting to explore/read during the Christmas break 00:51:12 Stan Piotrowski: I’m in the same boat as SriRam 00:51:32 Stan Piotrowski: Curious to know more about this but can definitely see myself getting lost in a rabbit hole 00:54:23 Ryan S: at some point -- maybe a different session -- can we dive deep into the different stat options ("identity", "count", etc.) 00:54:47 Ryan S: specifically, what do they do and when would you use them 00:55:09 Ryan Metcalf: June, this is amazing! 00:57:27 SriRam: Countdown starts...... 5 mins to come back to reality !!! :D 01:02:21 Stan Piotrowski: Great talk, June! 01:02:34 Kent Johnson: Thank you! See you next week! 
"],["position-scales-and-axes.html", "Chapter 9 Position scales and axes", " Chapter 9 Position scales and axes Learning Objectives What are the defining components of a scale? When/why does the data need to be transformed for a visualization? What are the defining components of an axis? What is the relationship between scale and axis? "],["introduction-preliminaries-asides.html", "9.1 Introduction / preliminaries / asides", " 9.1 Introduction / preliminaries / asides This chapter introduces position scales and axes. It may also be helpful to understand position scales and axes as position scales and guides, because axes share the same API as guides for non-positional scales like color legends. The parallel will be clearer in the next chapter. It’s worthwhile to read the documentation of the {scales} package to learn more about scales, since that handles a lot of the (re-)scaling and transformation under the hood. It may be good to start with the rstudio::conf2020 talk on scales. It should also be noted that there’s some discussion about revamping the scales_* API. See issue #4269 and PR #4271 Lastly, a small aside on the book’s after_stat() example in the intro, continuing nicely from our discussion on ggplot internals last week. ## [1] "StatBin" ## Aesthetic mapping: ## * `x` -> `after_stat(count)` ## * `y` -> `after_stat(count)` ## * `weight` -> 1 ## [1] "x|y" ## Aesthetic mapping: ## * `x` -> `displ` "],["numeric.html", "9.2 10.1 Numeric", " 9.2 10.1 Numeric 9.2.1 10.1.1 Limits The book doesn’t have content for this section (??) But we know that you can set limits with xlim()/ylim() or scale_x|y_*(limits = ) 9.2.2 10.1.2 Out of bounds values NOTE: A big theme of the {scales} package as of v1.1.1 (May 2020) is that they have very transparent function names. For example, the family of functions for Out Of Bounds (oob) handling are all named oob_*(). This is an intentional (re-)design of the package to work nicely with autocomplete. 
## [1] "oob_censor" "oob_censor_any" "oob_discard" ## [4] "oob_keep" "oob_squish" "oob_squish_any" ## [7] "oob_squish_infinite" By default, data outside scales are set to NA. This is because the oob argument is set to oob_censor()/censor(). Note that oob only applies to continuous scales, since values of a discrete scale form a fixed set. ## { ## call <- caller_call() ## if (scale_override_call(call)) { ## call <- current_call() ## } ## sc <- continuous_scale(ggplot_global$x_aes, palette = identity, ## name = name, breaks = breaks, n.breaks = n.breaks, minor_breaks = minor_breaks, ## labels = labels, limits = limits, expand = expand, oob = oob, ## na.value = na.value, transform = transform, trans = trans, ## guide = guide, position = position, call = call, super = ScaleContinuousPosition) ## set_sec_axis(sec.axis, sc) ## } ## censor Book’s examples: Equivalent solutions with oob_*() You can use oob functions for non-positional scales 9.2.3 10.1.3 Visual range expansion Book examples: With expansion() from v3.3.0 (Dec 2020) ## $mult ## [1] 0 ## ## $add ## [1] 0 9.2.4 10.1.4 Exercises 9.2.5 10.1.5 Breaks ## [1] "breaks_extended" "breaks_hms" "breaks_log" "breaks_pretty" ## [5] "breaks_timespan" "breaks_width" Book example: ## const up txt big log ## 1 1 1 a 1000 2 ## 2 1 2 b 2000 5 ## 3 1 3 c 3000 10 ## 4 1 4 d 4000 2000 Demo from {scales}: ## scale_x_continuous(breaks = scales::breaks_extended()) ## scale_x_continuous(breaks = scales::breaks_extended(n = 2)) ## scale_x_continuous(NULL) At the vector level: ## [1] 1000 2000 3000 4000 ## [1] 1000 4000 Other breaks: ## [1] 0 25 50 75 100 ## [1] 0 10 20 30 40 50 60 70 80 90 100 110 ## [1] 0 20 40 60 80 100 120 ## [1] 1 10 100 1000 Debugging arguments in scale_*() that take function factories 9.2.6 10.1.6 Minor breaks Book example: ## [1] 1 2 3 4 5 6 7 8 9 10 20 30 ## [13] 40 50 60 70 80 90 100 200 300 400 500 600 ## [25] 700 800 900 1000 2000 3000 4000 5000 6000 7000 8000 9000 ## [37] 10000 There are also minor break 
functions: ## [1] "minor_breaks_n" "minor_breaks_width" 9.2.7 10.1.7 Labels ## [1] "label_bytes" "label_comma" "label_currency" ## [4] "label_date" "label_date_short" "label_dollar" ## [7] "label_log" "label_math" "label_number" ## [10] "label_number_auto" "label_number_si" "label_ordinal" ## [13] "label_parse" "label_percent" "label_pvalue" ## [16] "label_scientific" "label_time" "label_timespan" ## [19] "label_wrap" Book examples: 9.2.8 10.1.8 Exercises 9.2.9 10.1.9 Transformations Book example: The transformation is carried out by a “transformer”, which describes the transformation, its inverse, and how to draw the labels. You can construct your own transformer using scales::trans_new() Case study: make reversed log x-axis ## Transformer: log-10 [1e-100, Inf] ## Transformer: reverse [-Inf, Inf] ## $name ## ## ## $transform ## ## ## $inverse ## ## ## $d_transform ## NULL ## ## $d_inverse ## NULL ## ## $breaks ## extended_breaks() ## ## $minor_breaks ## regular_minor_breaks() ## ## $format ## format_format() ## ## $domain ## c(-Inf, Inf) Regardless of which method you use, the transformation occurs before any statistical summaries. To transform after statistical computation use coord_trans() From the docs: Example where stat transformation matters: ## x ymin ymax ymin_final ymax_final ## 1 1 12 28 12 28 ## 2 2 22 33 17 44 ## 3 3 15 26 15 26 ## x ymin ymax ymin_final ymax_final ## 1 1 1.079181 1.447158 1.079181 1.447158 ## 2 2 1.361728 1.531479 1.230449 1.643453 ## 3 3 1.176091 1.414973 1.176091 1.414973 ## x ymin ymax ymin_final ymax_final ## 1 1 12 28 12 28 ## 2 2 22 33 17 44 ## 3 3 15 26 15 26 9.2.10 ASIDE - A little more on transformations transform() method of the Scales ggproto: transform() Transforms a vector of values using self$trans. This occurs before the Stat is calculated. 
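A minimal sketch of the ordering described above, assuming ggplot2 is installed. The data frame `df` and the object names are made up for illustration. A scale transformation is applied before the stat runs, so layer_data() reports transformed values, while coord_trans() leaves the layer data on the original scale and only changes how it is drawn.

```r
library(ggplot2)

# Toy data (hypothetical), spanning several orders of magnitude
df <- data.frame(x = c(1, 10, 100, 1000), y = 1:4)

# Scale transformation: x is log10-transformed before any stat is computed,
# so the built layer data already holds the transformed values
p_scale <- ggplot(df, aes(x, y)) + geom_point() + scale_x_log10()
x_scale <- as.numeric(layer_data(p_scale)$x)

# Coord transformation: applied after the stat, at drawing time,
# so the built layer data still holds the raw values
p_coord <- ggplot(df, aes(x, y)) + geom_point() + coord_trans(x = "log10")
x_coord <- as.numeric(layer_data(p_coord)$x)

x_scale
x_coord
```

With geom_point() both plots look alike; the difference matters once a stat summarises the data, as in the ymin/ymax example above.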
Transformation changes the layer data ## const up txt big log ## 1 1 1 a 1000 2 ## 2 1 2 b 2000 5 ## 3 1 3 c 3000 10 ## 4 1 4 d 4000 2000 ## x y PANEL group shape colour size fill alpha stroke ## 1 -1000 1 1 1 19 black 1.5 NA NA 0.5 ## 2 -2000 2 1 2 19 black 1.5 NA NA 0.5 ## 3 -3000 3 1 3 19 black 1.5 NA NA 0.5 ## 4 -4000 4 1 4 19 black 1.5 NA NA 0.5 ## function () ## { ## new_transform("reverse", function(x) -x, function(x) -x, ## d_transform = function(x) rep(-1, length(x)), d_inverse = function(x) rep(-1, ## length(x)), minor_breaks = regular_minor_breaks(reverse = TRUE)) ## } ## <bytecode: 0x55d92b830ed0> ## <environment: namespace:scales> ## List of 9 ## $ name : chr "reverse" ## $ transform :function (x) ## $ inverse :function (x) ## $ d_transform :function (x) ## $ d_inverse :function (x) ## $ breaks :function (x, n = n_default) ## $ minor_breaks:function (b, limits, n) ## $ format :function (x) ## $ domain : num [1:2] -Inf Inf ## - attr(*, "class")= chr "transform" ## [1] -1000 -2000 -3000 -4000 ## [1] 1000 2000 3000 4000 ## [1] "1000" "2000" "3000" "4000" Most useful for positioning purposes (ex: time_trans()) ## [1] 953553600 953557200 953560800 953564400 953568000 953571600 953575200 ## [8] 953578800 953582400 953586000 ## [1] "2000-03-20 12:00:00 UTC" "2000-03-20 13:00:00 UTC" ## [3] "2000-03-20 14:00:00 UTC" "2000-03-20 15:00:00 UTC" ## [5] "2000-03-20 16:00:00 UTC" "2000-03-20 17:00:00 UTC" ## [7] "2000-03-20 18:00:00 UTC" "2000-03-20 19:00:00 UTC" ## [9] "2000-03-20 20:00:00 UTC" "2000-03-20 21:00:00 UTC" ## [1] "12:00" "15:00" "18:00" "21:00" ## x y PANEL group shape colour size fill alpha stroke ## 1 953553600 0 1 -1 19 black 1.5 NA NA 0.5 ## 2 953557200 0 1 -1 19 black 1.5 NA NA 0.5 ## 3 953560800 0 1 -1 19 black 1.5 NA NA 0.5 ## 4 953564400 0 1 -1 19 black 1.5 NA NA 0.5 ## 5 953568000 0 1 -1 19 black 1.5 NA NA 0.5 ## 6 953571600 0 1 -1 19 black 1.5 NA NA 0.5 ## 7 953575200 0 1 -1 19 black 1.5 NA NA 0.5 ## 8 953578800 0 1 -1 19 black 1.5 NA NA 0.5 
## 9 953582400 0 1 -1 19 black 1.5 NA NA 0.5 ## 10 953586000 0 1 -1 19 black 1.5 NA NA 0.5 "],["date-time.html", "9.3 10.2 Date-time", " 9.3 10.2 Date-time 9.3.1 10.2.1 Breaks Book example: Making it explicit: Book example: ## [1] "1900-01-01" "1925-01-01" "1950-01-01" "1975-01-01" "2000-01-01" Using offset argument (unit = days): ## [1] "1900-02-01" "1925-02-01" "1950-02-01" "1975-02-01" "2000-02-01" Calculating the offset: ## Time difference of 31 days 9.3.2 10.2.2 Minor breaks Book examples: In the second plot, the major and minor breaks follow slightly different patterns: the minor breaks are always spaced 7 days apart but the major breaks are 1 month apart. Because the months vary in length, this leads to slightly uneven spacing. Explicit: 9.3.3 10.2.3 Labels Book examples: "],["discrete.html", "9.4 10.3 Discrete", " 9.4 10.3 Discrete Book examples: 9.4.1 10.3.1 Limits For discrete scales, limits should be a character vector that enumerates all possible values. Censors missing categories in the set: Adds new categories without value: Same effect with drop = FALSE with unused factor levels. It drops unused factor levels by default, though 9.4.2 10.3.2 Scale labels 9.4.3 10.3.2 Scale labels Book example: Debugging strategy 9.4.4 10.3.3 guide_axis() Book examples: More guides in {ggh4x} - https://teunbrand.github.io/ggh4x/ "],["binned.html", "9.5 10.4 Binned", " 9.5 10.4 Binned Book example: "],["aside---geom_sf-limits.html", "9.6 ASIDE - geom_sf() + limits", " 9.6 ASIDE - geom_sf() + limits 9.6.1 Example from Twitter: https://twitter.com/Josh_Ebner/status/1470818469801299970?s=20 9.6.2 Reprexes from Ryan S: ## # A tibble: 6 × 2 ## x_coord y_coord ## <dbl> <dbl> ## 1 1 1 ## 2 1 2 ## 3 2 1 ## 4 3 2 ## 5 6 5 ## 6 1 1 Full range polygon Polygon with limits Path with limits geom_sf() without limits geom_sf() with limits 9.6.3 Further exploration Using geom_sf() adds CoordSf by default ## [1] "CoordSf" "CoordCartesian" "Coord" "ggproto" ## [5] "gg" ## [1] "CoordSf" 
"CoordCartesian" "Coord" "ggproto" ## [5] "gg" In fact, geom_sf() must be used with coord_sf() ## Error in `geom_sf()`: ## ! Problem while converting geom to grob. ## ℹ Error occurred in the 1st layer. ## Caused by error in `draw_panel()`: ## ! `geom_sf()` can only be used with `coord_sf()`. The underlying geometry is untouched (indicating that limits are not removing data) ## geometry PANEL group xmin xmax ymin ymax linetype alpha ## 1 POLYGON ((1 1, 1 2, 2 1, 3 ... 1 -1 1 6 1 5 1 NA ## stroke ## 1 0.5 ## geometry PANEL group xmin xmax ymin ymax linetype alpha ## 1 POLYGON ((1 1, 1 2, 2 1, 3 ... 1 -1 1 NA 1 5 1 NA ## stroke ## 1 0.5 ## [1] TRUE OOB handling inside scale_x|y_continuous() cannot override the behavior. Instead, coord_sf(lims_method = ) offers other spatial-specific methods. Censor doesn’t seem to be one but an option like \"geometry_bbox\" automatically sets limits to the smallest bounding box that contains all geometries. Interesting note from the docs: … specifying limits via position scales or xlim()/ylim() is strongly discouraged, as it can result in data points being dropped from the plot even though they would be visible in the final plot region. 
9.6.4 Internals Scale censor for geom_polygon() Scale censor for geom_sf() Inspecting the rendered geom with layer_grob() ## # A tibble: 6 × 2 ## x y ## <simplUnt> <simplUnt> ## 1 0.04545455native 0.04545455native ## 2 0.04545455native 0.2727273native ## 3 0.2272727native 0.04545455native ## 4 0.4090909native 0.2727273native ## 5 0.9545455native 0.9545455native ## 6 0.04545455native 0.04545455native ## # A tibble: 6 × 2 ## x y ## <simplUnt> <simplUnt> ## 1 0.04545455native 0.04545455native ## 2 0.04545455native 0.2727273native ## 3 0.3484848native 0.04545455native ## 4 0.6515152native 0.2727273native ## 5 1.560606native 0.9545455native ## 6 0.04545455native 0.04545455native "],["meeting-videos-9.html", "9.7 Meeting Videos", " 9.7 Meeting Videos 9.7.1 Cohort 1 Meeting chat log 00:59:06 June Choe: There's also a nice animation from wikipedia (the cylinder is squished because of perceptual inequality between hues) - https://upload.wikimedia.org/wikipedia/commons/transcoded/8/8d/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm.480p.vp9.webm "],["colour-scales-and-legends.html", "Chapter 10 Colour Scales and Legends", " Chapter 10 Colour Scales and Legends Learning Objectives Learn how to map values to colours in ggplot2 Learn about colour theory (a more detailed exposition is available online at http://tinyurl.com/clrdtls) "],["a-little-colour-theory.html", "10.1 A little colour theory", " 10.1 A little colour theory There have been many attempts to come up with colours spaces that are more perceptually uniform. We’ll use a modern attempt called the HCL colour space, which has three components of hue, chroma and luminance: -Hue ranges from 0 to 360 (an angle) and gives the “colour” of the colour (blue, red, orange, etc). -Chroma is the “purity” of a colour, ranging from 0 (grey) to a maximum that varies with luminance. -Luminance is the lightness of the colour, ranging from 0 (black) to 1 (white). 
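A minimal sketch of the three HCL components, using grDevices::hcl() from base R. One caveat: hcl() takes luminance on a 0 to 100 scale, not the 0 to 1 range used in the description above. The hue angles, chroma and luminance values are illustrative choices that mirror the defaults of ggplot2's hue scale (h = c(0, 360) + 15, c = 100, l = 65).

```r
# Six evenly spaced hue angles around the colour wheel
# (375 is the same hue as 15, so the last point is dropped)
hues <- seq(15, 375, length.out = 7)[-7]

# Same chroma and luminance, different hues: six distinct colours
cols <- hcl(h = hues, c = 100, l = 65)

# Zero chroma removes the colour entirely: only luminance varies,
# giving a grey ramp from dark to light
greys <- hcl(h = 0, c = 0, l = c(0, 50, 100))

cols
greys
```

Holding chroma and luminance fixed while varying hue is what makes the default discrete palette look evenly "weighted" across categories.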
An additional complication is that many people (~10% of men) do not possess the normal complement of colour receptors and so can distinguish fewer colours than usual. In brief, it’s best to avoid red-green contrasts, and to check your plots with systems that simulate colour blindness. Vischeck (https://www.vischeck.com/vischeck/) is one online solution. Another alternative is the dichromat package which provides tools for simulating colour blindness, and a set of colour schemes known to work well for colour-blind people. You can also help people with colour blindness in the same way that you can help people with black-and-white printers: by providing redundant mappings to other aesthetics like size, line type or shape. 10.1.1 Colour blindness "],["continuous-colour-scales.html", "10.2 Continuous colour scales", " 10.2 Continuous colour scales Colour gradients are often used to show the height of a 2d surface. The plots in this section use the surface of a 2d density estimate of the faithful dataset which records the waiting time between eruptions and the duration of each eruption for the Old Faithful geyser in Yellowstone Park. Any time I refer to scale_fill_*() in this section there is a corresponding scale_colour_*() for the colour aesthetic (or scale_color_*() if you prefer US spelling). 10.2.1 Particular palettes There are multiple ways to specify continuous colour scales. You can construct your own palette, but it is unnecessary because there are many “hand picked” palettes available. ggplot2 supplies two scale functions that bundle pre-specified palettes, scale_fill_viridis_c() and scale_fill_distiller(). The viridis scales are designed to be perceptually uniform in both colour and when reduced to black and white, and to be perceptible to people with various forms of colour blindness. 
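A short sketch of the two bundled palette scales, assuming ggplot2 is installed. It uses the faithfuld dataset that ships with ggplot2. The object names (erupt, p_viridis, p_magma, p_distiller) and the palette options "magma" and "RdPu" are illustrative choices.

```r
library(ggplot2)

# Base plot: 2d density estimate of the Old Faithful data
erupt <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster() +
  scale_x_continuous(NULL, expand = c(0, 0)) +
  scale_y_continuous(NULL, expand = c(0, 0)) +
  theme(legend.position = "none")

# Viridis scales: default palette and the "magma" variant
p_viridis <- erupt + scale_fill_viridis_c()
p_magma   <- erupt + scale_fill_viridis_c(option = "magma")

# ColorBrewer palette interpolated for continuous data
p_distiller <- erupt + scale_fill_distiller(palette = "RdPu")
```

Printing any of the three objects draws the plot; only the fill scale differs between them.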
The second group of continuous colour scales built in to ggplot2 are derived from the ColorBrewer scales: scale_fill_brewer() provides these colours as discrete palettes, while scale_fill_distiller() and scale_fill_fermenter() are the continuous and binned analogs. Outside ggplot2, scale_fill_scico() from the scico package provides palettes that are perceptually uniform and suitable for scientific visualisation. A particularly useful package is paletteer, which aims to provide a common interface. 10.2.2 Robust recipes The default scale for continuous fill scales is scale_fill_continuous() which in turn defaults to scale_fill_gradient(). As a consequence, these three commands produce the same plot using a gradient scale. Gradient scales provide a robust method for creating any colour scheme you like. You just specify two or more reference colours, and ggplot2 will interpolate linearly between them. Three functions that you can use for this purpose are: - scale_fill_gradient() produces a two-colour gradient - scale_fill_gradient2() produces a three-colour gradient with specified midpoint - scale_fill_gradientn() produces an n-colour gradient The Munsell colour system provides an easy way of specifying colours based on their hue, chroma and luminance. The munsell package provides easy access to the Munsell colours, which can then be used to specify a gradient scale. For more information on the munsell package see https://github.com/cwickham/munsell/. Three-point gradient scales typically convey the perceptual impression that there is a natural midpoint (often a zero value) from which the other values diverge. The left plot below shows how to create a divergent “yellow/blue” scale. If you have colours that are meaningful for your data (e.g., black body colours or standard terrain colours), or you’d like to use a palette produced by another package, you may wish to use an n-point gradient. The middle and right plots below use the colorspace package. 
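The three gradient functions named in this section can be sketched with base colours only, assuming ggplot2 is installed. The reference colours, the midpoint of 0.02 and the use of terrain.colors() are illustrative choices and do not reproduce the munsell or colorspace plots.

```r
library(ggplot2)

erupt <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster()

# Two-colour gradient: linear interpolation from low to high
p_two <- erupt + scale_fill_gradient(low = "grey", high = "brown")

# Three-colour diverging gradient around an explicit midpoint
p_three <- erupt +
  scale_fill_gradient2(low = "grey", mid = "white", high = "brown",
                       midpoint = 0.02)

# n-colour gradient through an arbitrary vector of colours
p_n <- erupt + scale_fill_gradientn(colours = terrain.colors(7))
```

The midpoint only matters for scale_fill_gradient2(): it fixes which data value receives the mid colour, so it should be a value that is meaningful for the data.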
For more information on the colorspace package see https://colorspace.r-forge.r-project.org/. 10.2.3 Missing values All continuous colour scales have an na.value parameter that controls what colour is used for missing values (including values outside the range of the scale limits). By default it is set to grey, which will stand out when you use a colourful scale. If you use a black and white scale, you might want to set it to something else to make it more obvious. You can set na.value = NA to make missing values invisible, or choose a specific colour if you prefer: 10.2.4 Limits, breaks and labels You can suppress the breaks entirely by setting them to NULL. For axes, this removes the tick marks, grid lines, and labels; and for legends this removes the keys and labels. 10.2.5 Legends "],["discrete-colour-scales.html", "10.3 Discrete colour scales", " 10.3 Discrete colour scales Discrete colour and fill scales occur in many situations. A typical example is a barchart that encodes both position and fill to the same variable. The default scale for discrete colours is scale_fill_discrete() which in turn defaults to scale_fill_hue() so these are identical plots: 10.3.1 Brewer scales scale_colour_brewer() is a discrete colour scale that—along with the continuous analog scale_colour_distiller() and binned analog scale_colour_fermenter()—uses handpicked “ColorBrewer” colours taken from http://colorbrewer2.org/. These colours have been designed to work well in a wide variety of situations, although the focus is on maps and so the colours tend to work better when displayed in large areas. There are many different options: The first group of palettes are sequential scales that are useful when your discrete scale is ordered (e.g., rank data), and are available for continuous data using scale_colour_distiller(). For unordered categorical data, the palettes of most interest are those in the second group. 
‘Set1’ and ‘Dark2’ are particularly good for points, and ‘Set2’, ‘Pastel1’, ‘Pastel2’ and ‘Accent’ work well for areas. Note that no palette is uniformly good for all purposes. Scatter plots typically use small plot markers, and bright colours tend to work better than subtle ones: Bar plots usually contain large patches of colour, and bright colours can be overwhelming. Subtle colours tend to work better in this situation: 10.3.2 Hue and grey scales The default colour scheme picks evenly spaced hues around the HCL colour wheel. This works well for up to about eight colours, but after that it becomes hard to tell the different colours apart. You can control the default chroma and luminance, and the range of hues, with the h, c and l arguments: One disadvantage of the default colour scheme is that because the colours all have the same luminance and chroma, when you print them in black and white, they all appear as an identical shade of grey. Noting this, if you are intending a discrete colour scale to be printed in black and white, it is better to use scale_fill_grey() which maps discrete data to grays, from light to dark: 10.3.3 Paleteer Scales 10.3.4 Manual scales If none of the hand-picked palettes is suitable, or if you have your own preferred colours, you can use scale_fill_manual() to set the colours manually. This can be useful if you wish to choose colours that highlight a secondary grouping structure or draw attention to different comparisons: You can also use a named vector to specify colors to be assigned to each level which allows you to specify the levels in any order you like: 10.3.5 Limits, breaks and labels 10.3.6 Legends "],["binned-colour-scales.html", "10.4 Binned colour scales", " 10.4 Binned colour scales Color scales also come in binned versions. The default scale is scale_fill_binned() which in turn defaults to scale_fill_steps(). These scales have an n.breaks argument that controls the number of discrete colour categories created by the scale. 
Counterintuitively—because the human visual system is very good at detecting edges—this can sometimes make a continuous colour gradient easier to perceive: In other respects scale_fill_steps() is analogous to scale_fill_gradient(), and allows you to construct your own two-colour gradients. There is also a three-colour variant scale_fill_steps2() and n-colour scale variant scale_fill_stepsn() that behave similarly to their continuous counterparts: A brewer analog for binned scales also exists, and is called scale_fill_fermenter(): Note that like the discrete scale_fill_brewer()—and unlike the continuous scale_fill_distiller()—the binned function scale_fill_fermenter() does not interpolate between the brewer colours, and if you set n.breaks larger than the number of colours in the palette a warning message will appear and some colours will not be displayed. 10.4.1 Limits, breaks and labels 10.4.2 Legends "],["date-time-colour-scales.html", "10.5 Date Time Colour Scales", " 10.5 Date Time Colour Scales When a colour aesthetic is mapped to a date/time type, ggplot2 uses scale_colour_date() or scale_colour_datetime() to specify the scale. These are designed to handle date data, analogous to the date scales discussed in Section 10.2. These scales have date_breaks and date_labels arguments that make it a little easier to work with these data, as the slightly contrived example below illustrates: "],["alpha-scales.html", "10.6 Alpha scales", " 10.6 Alpha scales Alpha scales map the transparency of a shade to a value in the data and can be a convenient way to visually down-weight less important observations. scale_alpha() is an alias for scale_alpha_continuous() since that is the most common use of alpha, and it saves a bit of typing. "],["legend-position.html", "10.7 Legend position", " 10.7 Legend position A number of settings that affect the overall display of the legends are controlled through the theme system. 
You’ll learn more about that in Section 18.2, but for now, all you need to know is that you modify theme settings with the theme() function. The position of legends is controlled by the theme setting legend.position, which takes values “right”, “left”, “top”, “bottom”, or “none” (no legend). Switching between left/right and top/bottom modifies how the keys in each legend are laid out (horizontally or vertically), and how multiple legends are stacked (horizontally or vertically). If needed, you can adjust those options independently: legend.direction: layout of items in legends (“horizontal” or “vertical”). legend.box: arrangement of multiple legends (“horizontal” or “vertical”). legend.box.just: justification of each legend within the overall bounding box, when there are multiple legends (“top”, “bottom”, “left”, or “right”). Alternatively, if there’s a lot of blank space in your plot you might want to place the legend inside the plot by setting legend.position to a numeric vector of length two. The numbers represent a relative location in the panel area: c(0, 1) is the top-left corner and c(1, 0) is the bottom-right corner. You control which corner of the legend the legend.position refers to with legend.justification, which is specified in a similar way. Unfortunately, positioning the legend exactly where you want it requires a lot of trial and error. 
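As a minimal sketch of the keyword-based settings described in this section (the mpg data and the choice of drv as the colour variable are assumptions made for illustration):

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point()

# Legend below the plot; in this position the keys are laid out horizontally by default
p_bottom <- base + theme(legend.position = 'bottom')

# Keep the legend on the right but force a horizontal key layout
p_horiz <- base + theme(legend.position = 'right', legend.direction = 'horizontal')

# Remove the legend entirely
p_none <- base + theme(legend.position = 'none')
```

The numeric-vector form for placing a legend inside the panel is left out of the sketch on purpose: how it is specified has changed between ggplot2 releases, so it is worth checking the theme() documentation for the installed version.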
"],["meeting-videos-10.html", "10.8 Meeting Videos", " 10.8 Meeting Videos 10.8.1 Cohort 1 Meeting chat log 00:59:06 June Choe: There's also a nice animation from wikipedia (the cylinder is squished because of perceptual inequality between hues) - https://upload.wikimedia.org/wikipedia/commons/transcoded/8/8d/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm/SRGB_gamut_within_CIELCHuv_color_space_mesh.webm.480p.vp9.webm Meeting chat log 00:12:21 June Choe: BTW as of April 2021 v0.6.0 {viridis} got 3 more color palettes -- mako, rocket, and turbo --- https://sjmgarnier.github.io/viridis/articles/intro-to-viridis.html 00:20:41 June Choe: "for legends this removes the keys and labels" i guess? 00:22:34 June Choe: scale_fill_hue in turn uses scales::hue_pal(), if you want to use the default discrete color palette - https://scales.r-lib.org/reference/hue_pal.html 00:30:43 Federica Gazzelloni: really like this one: https://colorspace.r-forge.r-project.org/ 00:35:02 Michael Haugen: https://github.com/rfordatascience/tidytuesday 00:35:56 Michael Haugen: When I have accessed data from TT I have usually read them in manually. 00:36:04 Michael Haugen: for example: starbucks <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-12-21/starbucks.csv') 00:46:06 Ryan Metcalf: PodCast Page: https://www.tidytuesday.com/ 00:50:16 Michael Haugen: Ryan Almost took down TidyTuesday 00:50:46 Michael Haugen: Make sure to commit to main 00:50:52 Ryan S: lol "],["other-aesthetics.html", "Chapter 11 Other Aesthetics", " Chapter 11 Other Aesthetics Learning objectives: To learn about several other aesthetics that ggplot2 can use to represent data, including: size scales shape scales line type scales manual scales identity scales "],["size.html", "11.1 Size", " 11.1 Size The size aesthetic is typically used to scale points and text. 
The default scale for size aesthetics is scale_size() in which a linear increase in the variable is mapped onto a linear increase in the area (not the radius) of the geom. There are several size scales: scale_size_area() and scale_size_binned_area() are versions of scale_size() and scale_size_binned() that ensure that a value of 0 maps to an area of 0. scale_radius() maps the data value to the radius rather than to the area (Section 12.1.1). scale_size_binned() is a size scale that behaves like scale_size() but maps continuous values onto discrete size categories, analogous to the binned position and colour scales discussed in Sections 10.4 and 11.4 respectively. Legends associated with this scale are discussed in Section 12.1.2. scale_size_date() and scale_size_datetime() are designed to handle date data, analogous to the date scales discussed in Section 10.2. 11.1.1 Radius size scales There are situations where area scaling is undesirable, and for such situations scale_radius() may be more appropriate. For example, consider a data set containing astronomical data that includes the radius of different planets: ## name type position radius orbit ## 1 Mercury Inner 1 2440 57900000 ## 2 Venus Inner 2 6052 108200000 ## 3 Earth Inner 3 6378 149600000 ## 4 Mars Inner 4 3390 227900000 ## 5 Jupiter Outer 5 71400 778300000 ## 6 Saturn Outer 6 60330 1427000000 ## 7 Uranus Outer 7 25559 2871000000 ## 8 Neptune Outer 8 24764 4497100000 11.1.2 Binned size scales Binned size scales work similarly to binned scales for colour and position aesthetics (Sections 11.4 and 10.4) with the exception of how legends are displayed. The default legend for a binned size scale, and all binned scales except position and colour aesthetics, is governed by guide_bins(). 
For instance, in the mpg data we could use scale_size_binned() to create a binned version of the continuous variable hwy: Unlike guide_legend(), the guide created for a binned scale by guide_bins() does not organize the individual keys into a table. Instead they are arranged in a column (or row) along a single vertical (or horizontal) axis, which by default is displayed with its own axis. The important arguments to guide_bins() are listed below: axis indicates whether the axis should be drawn (default is TRUE); direction is a character string specifying the direction of the guide, either “vertical” (the default) or “horizontal”; show.limits specifies whether tick marks are shown at the ends of the guide axis (default is FALSE); axis.colour, axis.linewidth and axis.arrow are used to control the guide axis that is displayed alongside the legend keys; keywidth, keyheight, reverse and override.aes have the same behavior for guide_bins() as they do for guide_legend() (see Section 11.3.6). "],["shape.html", "11.2 Shape", " 11.2 Shape Values can be mapped to the shape aesthetic, most typically when you have a small number of discrete categories. Note: if the data variable contains more than 6 values it becomes difficult to distinguish between shapes, and ggplot2 will produce a warning. Although any one plot is unlikely to be readable with more than 6 distinct markers, there are 25 possible shapes to choose from. The default scale_shape() function has a single argument: set solid = TRUE (the default) to use a “palette” consisting of three solid shapes and three hollow shapes, or set solid = FALSE to use six hollow shapes: You can specify the marker types for each data value manually using scale_shape_manual(). For more information about manual scales see Section 12.4. 
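The two options described above, the solid argument and a manual shape scale, might look like this in practice. This is a sketch under assumptions: the mpg data, the drv variable and the particular shape codes are illustrative choices, not taken from the notes.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, hwy, shape = drv)) +
  geom_point(size = 2)

# Default shape palette restricted to hollow shapes
p_hollow <- base + scale_shape(solid = FALSE)

# Manual scale: a named vector maps each level of drv to a shape code
p_manual <- base + scale_shape_manual(values = c('4' = 16, f = 3, r = 17))
```

Naming the vector makes the mapping independent of the order of the factor levels, which is usually easier to maintain than relying on position.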
"],["line-type.html", "11.3 Line type", " 11.3 Line type It is possible to map a variable onto the linetype aesthetic, which works best for discrete variables with a small number of categories, where scale_linetype() is an alias for scale_linetype_discrete(). Continuous variables cannot be mapped to line types unless scale_linetype_binned() is used: although there is a scale_linetype_continuous() function, all it does is produce an error. With five categories the above plot is quite difficult to read. The default “palette” for linetype is supplied by the scales::linetype_pal() function, and includes the 13 linetypes shown below: You can control the line type by specifying a string with up to 8 hexadecimal values. In this specification, -the first value is the length of the first line segment, the second value is the length of the first space between segments, and so on. This allows you to specify your own line types using scale_linetype_manual(), or alternatively, by passing a custom function to the palette argument. Note that the last four lines are blank, because the linetypes() function defined above returns NA when the number of categories exceeds 9. The scale_linetype() function contains a na.value argument used to specify what kind of line is plotted for these values. By default this produces a blank line, but you can override this by setting na.value = “dotted”: Valid line types can be set using a human readable character string: “blank”, “solid”, “dashed”, “dotted”, “dotdash”, “longdash”, and “twodash” are all understood. "],["manual-scales-1.html", "11.4 Manual scales", " 11.4 Manual scales Manual scales are just a list of valid values that are mapped to the unique discrete values. If you want to customize these scales, you need to create your own new scale with the “manual” version of each: scale_linetype_manual(), scale_shape_manual(), scale_colour_manual(), etc. 
The manual scale has one important argument, values, where you specify the values that the scale should produce. If this vector is named, it will match the values of the output to the values of the input; otherwise it will match in order of the levels of the discrete variable. You will need some knowledge of the valid aesthetic values, which are described in vignette(“ggplot2-specs”). Manual scales have appeared earlier, in Sections 11.3.4 and 12.2. In the following example, you’ll see a creative use of scale_colour_manual() to display multiple variables on the same plot and show a useful legend. In most plotting systems, you’d color the lines and then add a legend: That doesn’t work in ggplot because there’s no way to add a legend manually. Instead, give the lines informative labels: And then tell the scale how to map labels to colours: "],["identity-scales.html", "11.5 Identity Scales", " 11.5 Identity Scales Identity scales, such as scale_colour_identity() and scale_shape_identity(), are used when your data is already scaled such that the data and aesthetic spaces are the same. The code below shows an example where the identity scale is useful. luv_colours contains the locations of all R’s built-in colours in the LUV colour space (the space that HCL is based on). ## L u v col ## 1 9341.570 -3.370649e-12 0.0000 white ## 2 9100.962 -4.749170e+02 -635.3502 aliceblue ## 3 8809.518 1.008865e+03 1668.0042 antiquewhite ## 4 8935.225 1.065698e+03 1674.5948 antiquewhite1 ## 5 8452.499 1.014911e+03 1609.5923 antiquewhite2 ## 6 7498.378 9.029892e+02 1401.7026 antiquewhite3 "],["meeting-videos-11.html", "11.6 Meeting Videos", " 11.6 Meeting Videos 11.6.1 Cohort 1 Meeting chat log 00:22:22 Federica Gazzelloni: that’s very useful 00:23:08 Michael Haugen: Arrows! 00:31:57 Ryan Metcalf: https://ggplot2-book.org/scale-other.html#scale-manual 00:39:42 Federica Gazzelloni: where do you put the question mark? 
00:39:49 Ryan Metcalf: It may only be me…I always forget how to pull installed datasets in R. If you run `data()` it will list all installed datasets. 00:39:55 Federica Gazzelloni: before the function' 00:40:03 Federica Gazzelloni: ?.. 00:40:17 Federica Gazzelloni: to have help information 00:40:28 Ryan Metcalf: I put it on the front: `?LakeHuron` 00:42:21 June Choe: BTW a tangent but something I just learned recently about the help syntax: `?` will exact match and `??` will regex match. So `?LakeHuron` and `??keHuro` also works (with the latter being a bit slower) 00:43:00 Ryan S: @June -- Whoa, that's cool 00:43:43 Ryan S: If anyone cares, here is the code that does the LakeHuron data WITH an automatic legend 00:43:45 Ryan S: data.frame(year = 1875:1972, level = as.numeric(LakeHuron)) %>% mutate(above = level + 5, below = level -5) %>% pivot_longer(cols = c("above", "below"), values_to = "new_level", names_to = "level_set") %>% ggplot(aes(x = year, y = new_level, groups = level_set, color = level_set)) + geom_line() 00:44:09 Federica Gazzelloni: cool 00:44:12 Ryan Metcalf: Awesome comment June! That would explain why I get “unexpected” behavior….I wasn’t sure of the differences. Thanks for clarifying! 00:44:22 June Choe: A more on-topic regex-y example of ?? would be like `??scale_.*_manual` 00:44:25 Ryan S: don't forget the groups = level_set…. 00:45:05 Federica Gazzelloni: 0.2 near the minimu 00:45:21 Federica Gazzelloni: scale alpha is 0 to 1 00:46:15 Federica Gazzelloni: thanks! 00:46:34 Ryan S: thanks, Lydia! 00:46:36 Federica Gazzelloni: they are all very useful features 00:47:56 June Choe: maybe we can do a week of tidytuesday session if many people of us are interested too! 00:48:06 Michael Haugen: ^^ 00:48:13 priyanka gagneja: sure 00:48:30 Michael Haugen: I like that; devote one week on a Tidy Tuesday and not a chapter. 
00:48:32 Federica Gazzelloni: would love that @june 00:49:54 June Choe: I have an old (static) tidytuesday submission done in D3 if you want to peak at the code - https://observablehq.com/@yjunechoe/tidytuesday-2021-22 00:50:16 June Choe: (but agree with everything Ryan M's saying about how complex it is + pretty big learning curve) 00:51:49 June Choe: there's base svg renderer and also {svglite} which is developed by Rstudio https://svglite.r-lib.org/ 00:55:03 Michael Haugen: D3PO 00:55:12 June Choe: I have an example of r2d3 rendered in Rmarkdown with D3 code edited in RStudio - https://gist.github.com/yjunechoe/074e0020841fec3009b239583f305adc 00:55:40 June Choe: (Rstudio has javascript syntax highlight support so writing D3 wasn't too weird) 00:56:47 Michael Haugen: Does Shiny replace the need for D3 or is that apples and organges? 00:57:41 June Choe: IMO shiny is bulkier because it requires an R server backend but D3/JS can entirely be server-side (all calculations happen inside the user's browser) 00:57:52 June Choe: oops client-side* 00:57:58 Michael Haugen: thanks June. Makes sense. 00:58:19 Lydia Gibson: Off topic: I believe they will be removing the examples from the Ggplot2 book in the third edition. 00:59:08 Federica Gazzelloni: of course you can use it for scraping tables 00:59:21 Ryan S: June -- if you design your application for client-side calculations, I assume you have to optimize it so that it doesn't clog up the user's computer? 00:59:40 Ryan Metcalf: R2D3 Package Link: https://rstudio.github.io/r2d3/ 00:59:52 Ryan S: example -- you don't want to throw a million records at the client-side just because your server side can handle it? 00:59:53 June Choe: @Ryan S yaaa and i don't have much experience in that but thats a big topic 01:00:02 June Choe: Thank you! 01:00:18 Ryan S: I'll try it using Ryan M's client side. :-) 01:01:05 Federica Gazzelloni: that would be great! 
01:01:10 Ryan Metcalf: Slack channel for Data Visualization Society: Datavizsociety.slack.com 01:01:16 Lydia Gibson: Yes please! 01:01:35 Ryan Metcalf: Finally, Pandoc link: https://pandoc.org/ 01:01:55 June Choe: didn't know about that slack - cool! "],["build-a-plot-layer-by-layer.html", "Chapter 12 Build a plot layer by layer", " Chapter 12 Build a plot layer by layer Learning objectives: Understanding ggplot layers How to control layers Application to real data "],["building-a-plot.html", "12.1 Building a plot", " 12.1 Building a plot In this chapter we talk about the grammar of graphics plots and their construction layer by layer. We use data from the {SpatialEpi} package: Let’s check what data is inside the package. We can use NYleukemia, which contains observations about leukemia cases in NY, along with population counts and spatial information such as the latitude and longitude where the cases were located. ## censustract.FIPS cases population ## 1 36007000100 3.08284 3540 ## 2 36007000200 4.08331 3560 ## 3 36007000300 1.08750 3739 ## censustract.FIPS x y ## 1 36007000100 -75.94087 42.10782 ## 2 36007000200 -75.93118 42.11099 ## 3 36007000300 -75.92011 42.11738 Let’s now make a first-layer visualization using ggplot2. The second layer of our plot adds the geoms. In general, when we make a ggplot, we build the plot without thinking about the layers, but what is happening under the hood when we add a layer? The layer() function is called to combine data, stat and geom. 
Layers are created using geom_* or stat_* calls or directly using the function: layer( geom = NULL, stat = NULL, data = NULL, mapping = NULL, position = NULL, params = list(), inherit.aes = TRUE, check.aes = TRUE, check.param = TRUE, show.legend = NA, key_glyph = NULL, layer_class = Layer ) To obtain the same results: layer() function components: mapping data geom stat position … "],["data.html", "12.2 Data", " 12.2 Data The layers of your plot can be populated with different datasets. Here we generate two new datasets from the df dataset. What does geom_smooth() do behind the scenes? It fits a model (in this case a loess model) and generates predictions about the trend of the data. In this example we create a grid of 50 values to have an average trend to show in a secondary layer of the plot. ## # A tibble: 3 × 2 ## population cases ## <dbl> <dbl> ## 1 9 0.194 ## 2 274. 0.273 ## 3 540. 0.369 ## [1] 50 2 ## [1] 281 5 The next step is to isolate the outliers (observations far away from the predicted values), using the resid() function to extract the model residuals ## Call: ## loess(formula = cases ~ population, data = df) ## ## Number of Observations: 281 ## Equivalent Number of Parameters: 5.33 ## Residual Standard Error: 1.769 ## Trace of smoother matrix: 5.84 (exact) ## ## Control settings: ## span : 0.75 ## degree : 2 ## family : gaussian ## surface : interpolate cell = 0.2 ## normalize: TRUE ## parametric: FALSE ## drop.square: FALSE And build the residuals std error vector: ## censustract.FIPS cases population x y ## 1 36007012500 7.13834 5911 -75.69563 42.06164 ## 2 36007013000 7.11907 5088 -76.00001 42.12407 ## 3 36007013302 0.19008 8122 -76.06948 42.13547 ## [1] 16 5 Add a new layer with different data: grid 12.2.1 Exercises Recreate the plot in the book "],["aesthetic-mappings.html", "12.3 Aesthetic mappings", " 12.3 Aesthetic mappings The aesthetics: aes() allows for some omissions, under certain conditions. 
The complete syntax would be: ggplot( data = ..., mapping = aes(x = ..., y = ..., ...)) In general the names x = and y = inside aes(x = ..., y = ..., ...) can be omitted. Sometimes R complains about a missing mapping; this typically happens when more than one layer with different datasets is used. To solve the issue, it is enough to specify all the mappings explicitly inside aes(). One more interesting question is: what happens when transformations are placed inside aes()? As an example, we can apply a log transformation (the example is from the diamonds dataset): aes(log(carat), log(price)) Simple calculations like this are fine inside aes(); for more complex transformations it is better to move the computation out of aes() and into an explicit dplyr::mutate() call. (Do not refer to variables with $ inside aes().) 12.3.1 Specifying the aesthetics in the plot vs. in the layers All of these alternatives are allowed: ggplot(mpg, aes(displ, hwy, colour = class)) + geom_point() ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) ggplot(mpg, aes(displ)) + geom_point(aes(y = hwy, colour = class)) ggplot(mpg) + geom_point(aes(displ, hwy, colour = class)) But under some conditions, such as the use of geom_smooth(), where the secondary aesthetics are specified matters: they may need to go in the layer rather than in the plot to produce the intended result. In the first case the smooth line doesn’t show up. 12.3.2 Setting vs. mapping What is the difference between mapping and setting an aesthetic? To map an aesthetic to a variable, put it inside aes(); to set it to a constant, put it outside aes(). The two give different results: geom_...(aes(colour = cut)) geom_...(colour="red") Setting an aesthetic to a constant means supplying a specific value, such as a colour name for the colour argument: ...,colour = "red") An alternative would be to use the function: scale_colour_identity() If more than one geom_smooth() is used in the plot, the different colors can be specified with a scale_color_...() function. 
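The difference between mapping and setting is easiest to see by building the variants next to each other. A minimal sketch, assuming the diamonds data set that ships with ggplot2; the object names are hypothetical:

```r
library(ggplot2)

# Mapping: colour varies with the cut variable and a legend is drawn
p_map <- ggplot(diamonds, aes(carat, price)) +
  geom_point(aes(colour = cut))

# Setting: every point is drawn in red; no scale or legend is involved
p_set <- ggplot(diamonds, aes(carat, price)) +
  geom_point(colour = 'red')

# Common mistake: a constant inside aes() is treated as data, so the value
# 'red' is passed through the default colour scale instead of being used as a colour
p_oops <- ggplot(diamonds, aes(carat, price)) +
  geom_point(aes(colour = 'red'))

# scale_colour_identity() tells ggplot2 to use the mapped values as literal colours
p_id <- p_oops + scale_colour_identity()
```

Printing p_oops typically shows points in the first colour of the default palette together with a legend labelled red, which is a quick way to recognise that a constant ended up inside aes().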
"],["geoms-1.html", "12.4 Geoms", " 12.4 Geoms geoms stands for geometric objects for short. Some geoms requires both x and y while others not, as well as other require more than simply x and y, such as xmax, ymax etc. If you do geom_ and tab all the available geoms appear in a list for you to choose from. As an example here we use the geom_quantile() to represent a smoothed quantile regression and the geom_rug() for maginal rugs. 12.4.1 Exercises Discussion The book suggests to download the cheatsheets: ggplot2 cheatsheet (Ex.5) Display how a variable has changed over time: source ## # A tibble: 3 × 6 ## date pce pop psavert uempmed unemploy ## <date> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 1967-07-01 507. 198712 12.6 4.5 2944 ## 2 1967-08-01 510. 198911 12.6 4.7 2945 ## 3 1967-09-01 516. 199113 11.9 4.6 2958 Show the detailed distribution of a single variable The distribution can be described using a frequency table and histogram. Focus attention on the overall trend in a large dataset Interesting resource Draw a map Label outlying points "],["stats.html", "12.5 Stats", " 12.5 Stats There are several stat_...() functions used to transform the data by summarizing information. For example the stat_ecdf() compute the empirical cumulative distribution plot Here we use stat_summary() function for *categorical data** 12.5.1 Generated variables from the stat_...() functions stat takes a data frame as input and returns a data frame as output. Here we use the diamonds dataset, to see hoe the after_stat() can be applied 12.5.2 Exercises What stats were used to create the Q-Q plot? What stats were used to create the Normal density? 
"],["position-adjustments.html", "12.6 Position adjustments", " 12.6 Position adjustments The position is very important for some geoms: position_nudge() position_jitter() position_jitterdodge() all of them can be used inside the geom: geom_count() "],["meeting-videos-12.html", "12.7 Meeting Videos", " 12.7 Meeting Videos 12.7.1 Cohort 1 Meeting chat log 00:08:06 June Choe: hey all! 00:08:13 Federica Gazzelloni: Hi!! 00:08:18 June Choe: thanks for moving the time to this hour 00:08:37 Federica Gazzelloni: That’s better for me either 00:08:43 Lydia Gibson: https://imstat.org/meetings-calendar/ims-international-conference-on-statistics-and-data-science-icsds/ 00:08:51 June Choe: (now i get to call in as I eat lunch at the student common space) 00:57:46 Kent Johnson: Thank you, this was an interesting chapter! 00:57:52 Michael Haugen: Thank you! 00:57:57 June Choe: thanks! 00:58:12 Ryan Metcalf: Thank you Federica! 00:58:23 Stan Piotrowski: Thanks for a great presentation! Meeting chat log 00:14:06 June Choe: gray is default iirc 00:23:25 June Choe: I wonder if something like this works with datetime values on x scale_x_date(date_breaks = "2 weeks", offset = 31) 00:23:54 June Choe: (or offset = -31, maybe) 00:25:30 June Choe: I see - I'll play around with it more ! 00:36:05 Federica Gazzelloni: rle {base}: Compute the lengths and values of runs of equal values in a vector – or the reverse operation. 00:36:27 Ryan Metcalf: Sorry team, I have to drop. Great job Kent! 00:48:15 Federica Gazzelloni: related with cumulative values 00:48:29 Priyanka Gagneja: Thanks Ryan. See you next time 00:50:34 June Choe: It's discussed in Advanced R book Ch. 10.2.4! https://adv-r.hadley.nz/function-factories.html?q=stateful#stateful-funs 00:50:43 Federica Gazzelloni: thanks! 00:52:04 Priyanka Gagneja: @June this in response the environment() ? 
01:05:03 June Choe: I have the 2nd edition of R Graphics book from 2011 that has a chapter on ggplot2 back then and the code has not changed (i'll see if I can upload a page from that) 01:06:06 June Choe: they also changed some syntax from tidyr in the new update from like a few days ago 01:06:16 June Choe: (to make it easier for users especailly with respect to nest!) 01:07:39 June Choe: thanks! "],["scales-and-guides.html", "Chapter 13 Scales and Guides", " Chapter 13 Scales and Guides Learning objectives: Illustrate that there is nothing preventing you from transforming other kinds of scales beyond continuous position scale Show how concepts for position scales apply elsewhere Discuss the theory underpinning scales and guides "],["theory-of-scales-and-guides.html", "13.1 Theory of scales and guides", " 13.1 Theory of scales and guides Each scale is a function from a region in data space to a region in aesthetic space. The axis or legend is the inverse function, known as the guide: it allows you to convert visual properties back to data. Surprisingly, axes and legends are the same type of thing, but while they look very different they have the same purpose: to allow you to read observations from the plot and map them back to their original values. The commonalities between the two are illustrated below: Argument name Axis Legend name Label Title breaks Ticks & grid line Key labels Tick label Key label However, legends are more complicated than axes, and consequently there are a number of topics that are specific to legends: 1. A legend can display multiple aesthetics (e.g. colour and shape), from multiple layers (Section 15.7.1), and the symbol displayed in a legend varies based on the geom used in the layer (Section 15.8) 2. Axes always appear in the same place. Legends can appear in different places, so you need some global way of positioning them. (Section 11.7) 3. 
Legends have more details that can be tweaked: should they be displayed vertically or horizontally? How many columns? How big should the keys be? This is discussed in (Section 15.5) 13.1.1 Scale specification An important property of ggplot2 is the principle that every aesthetic in your plot is associated with exactly one scale. For instance, when you write this ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) ggplot2 adds a default scale for each aesthetic used in the plot: ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) + scale_x_continuous() + scale_y_continuous() + scale_colour_discrete() ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) + scale_x_continuous(name = "A really awesome x axis label") + scale_y_continuous(name = "An amazingly great y axis label") The use of + to “add” scales to a plot is a little misleading because if you supply two scales for the same aesthetic, the last scale takes precedence: ggplot(mpg, aes(displ, hwy)) + geom_point() + scale_x_continuous(name = "Label 1") + scale_x_continuous(name = "Label 2") #> Scale for 'x' is already present. Adding another scale for 'x', which will #> replace the existing scale. ggplot(mpg, aes(displ, hwy)) + geom_point() + scale_x_continuous(name = "Label 2") ggplot(mpg, aes(displ, hwy)) + geom_point(aes(colour = class)) + scale_x_sqrt() + scale_colour_brewer() 13.1.2 Naming scheme The scale functions intended for users all follow a common naming scheme. You’ve probably already figured out the scheme, but to be concrete, it’s made up of three pieces separated by “_“: 1. scale 2. The name of the primary aesthetic (e.g., colour, shape or x) 3. The name of the scale (e.g., continuous, discrete, brewer). 13.1.3 Fundamental scale types All scale functions in ggplot2 belong to one of three fundamental types: continuous scales, discrete scales, and binned scales. 
Each fundamental type is handled by one of three scale constructor functions: continuous_scale(), discrete_scale() and binned_scale(). Although you should never need to call these constructor functions, they provide the organizing structure for scales and it is useful to know about them. "],["scale-breaks.html", "13.2 Scale Breaks", " 13.2 Scale Breaks Discussion of what unifies the concept of breaks across continuous, discrete and binned scales: they are specific data values at which the guide needs to display something. Include additional detail about break functions. "],["scale-limits.html", "13.3 Scale Limits", " 13.3 Scale Limits Section 15.1 introduced the concept that a scale defines a mapping from the data space to the aesthetic space. Scale limits are an extension of this idea: they dictate the region of the data space over which the mapping is defined. For continuous and binned scales, the data space is inherently continuous and one-dimensional, so the limits can be specified by two end points. For discrete scales, however, the data space is unstructured and consists only of a set of categories: as such the limits for a discrete scale can only be specified by enumerating the set of categories over which the mapping is defined. The toolbox chapters outline the common practical goals for specifying the limits: for position scales the limits are used to set the end points of the axis, for example. This leads naturally to the question of what ggplot2 should do if the data set contains “out of bounds” values that fall outside the limits. The default behaviour in ggplot2 is to convert out of bounds values to NA. We can override this default by setting oob argument of the scale, a function that is applied to all observations outside the scale limits. The default is scales::oob_censor() which replaces any value outside the limits with NA. Another option is scales::oob_squish() which squishes all values into the range. 
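The two out-of-bounds functions named above can be tried directly on a numeric vector, outside of any plot. A small sketch, assuming the scales package is installed; the vector x and the limits c(0, 1) are made up for illustration:

```r
library(scales)

x <- c(-1, 0.5, 2)

# Default behaviour: values outside the limits are replaced with NA
censored <- oob_censor(x, range = c(0, 1))

# Alternative: out-of-bounds values are pulled to the nearest limit
squished <- oob_squish(x, range = c(0, 1))

print(censored)
print(squished)
```

In a plot, the same functions are passed to the scale through its oob argument, for example scale_fill_gradient(limits = c(0, 1), oob = scales::oob_squish).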
An example using a fill scale is shown below: In the first plot the default fill colours are shown, ranging from dark blue to light blue. In the second plot the scale limits for the fill aesthetic are reduced so that the values for the three rightmost bars are replaced with NA and are mapped to a grey shade. In some cases this is desired behaviour but often it is not: the third plot addresses this by modifying the oob function appropriately. "],["scale-guides.html", "13.4 Scale guides", " 13.4 Scale guides Scale guides are more complex than scale names: where the name argument (and labs()) takes text as input, the guide argument (and guides()) requires a guide object created by a guide function such as guide_colourbar() or guide_legend(). The arguments to these functions offer additional fine control over the guide. The table below summarises the default guide functions associated with different scale types (scale type: default guide type): continuous scales for colour/fill aesthetics: colourbar; binned scales for colour/fill aesthetics: coloursteps; position scales (continuous, binned, and discrete): axis; discrete scales (except position scales): legend; binned scales (except position/colour/fill scales): bins. Each of these guide types has appeared earlier in the toolbox: guide_colourbar() is discussed in Section 11.2.5; guide_coloursteps() is discussed in Section 11.4.2; guide_axis() is discussed in Section 10.3.2; guide_legend() is discussed in Section 11.3.6; guide_bins() is discussed in Section 12.1.2. In addition to the functionality discussed in those sections, the guide functions have many arguments that are equivalent to theme settings like text colour, size, font etc, but only apply to a single guide. For information about those settings, see Chapter 18. "],["scale-transformation.html", "13.5 Scale transformation", " 13.5 Scale transformation The most common use for scale transformations is to adjust a continuous position scale, as discussed in Section 10.1.7. 
However, they can sometimes be helpful when applied to other aesthetics. Often this is purely a matter of visual emphasis. An example of this for the Old Faithful density plot is shown below. The linearly mapped scale on the left makes it easy to see the peaks of the distribution, whereas the transformed representation on the right makes it easier to see the regions of non-negligible density around those peaks: Transforming size aesthetics is also possible: In the plot on the left, the z value is naturally interpreted as a “weight”: if each dot corresponds to a group, the z value might be the size of the group. In the plot on the right, the size scale is reversed, and z is more naturally interpreted as a “distance” measure: distant entities are scaled to appear smaller in the plot. "],["legend-merging-and-splitting.html", "13.6 Legend merging and splitting", " 13.6 Legend merging and splitting There is always a one-to-one correspondence between position scales and axes. But the connection between non-position scales and legends is more complex: one legend may need to draw symbols from multiple layers (“merging”), or one aesthetic may need multiple legends (“splitting”). 13.6.1 Merging legends Merging legends occurs quite frequently when using ggplot2. For example, if you’ve mapped colour to both points and lines, the keys will show both points and lines. If you’ve mapped fill colour, you get a rectangle. Note the way the legend varies in the plots below: By default, a layer will only appear in the legend if the corresponding aesthetic is mapped to a variable with aes(). You can override whether or not a layer appears in the legend with show.legend: FALSE prevents a layer from ever appearing in the legend; TRUE forces it to appear when it otherwise wouldn’t. Using TRUE can be useful in conjunction with the following trick to make points stand out: ggplot2 tries to use the fewest number of legends to accurately convey the aesthetics used in the plot. 
It does this by combining legends where the same variable is mapped to different aesthetics. The figure below shows how this works for points: if both colour and shape are mapped to the same variable, then only a single legend is necessary. In order for legends to be merged, they must have the same name. So if you change the name of one of the scales, you’ll need to change it for all of them. One way to do this is by using labs() helper function: 13.6.2 Splitting legends Splitting a legend is a much less common data visualization task. In general it is not advisable to map one aesthetic (e.g. colour) to multiple variables, and so by default ggplot2 does not allow you to “split” the colour aesthetic into multiple scales with separate legends. Nevertheless, there are exceptions to this general rule, and it is possible to override this behaviour using the ggnewscale package. The ggnewscale::new_scale_colour() command acts as an instruction to ggplot2 to initialize a new colour scale: scale and guide commands that appear above the new_scale_colour() command will be applied to the first colour scale, and commands that appear below are applied to the second colour scale. To illustrate this the plot on the left uses geom_point() to display a large marker for each vehicle make in the mpg data, with a single colour scale that maps to the year. On the right, a second geom_point() layer is overlaid on the plot using small markers: this layer is associated with a different colour scale, used to indicate whether the vehicle has a 4-cylinder engine. Additional details, including functions that apply to other scale types, are available on the package website, https://github.com/eliocamp/ggnewscale. "],["legend-key-glyphs.html", "13.7 Legend key glyphs", " 13.7 Legend key glyphs In most cases the default glyphs shown in the legend key will be appropriate to the layer and the aesthetic. 
Should you need to override this behaviour, the key_glyph argument can be used to associate a particular layer with a different kind of glyph. For example: More precisely, each geom is associated with a function such as draw_key_point(), draw_key_boxplot() or draw_key_path() which is responsible for drawing the key when the legend is created. You can pass the desired key drawing function directly: for example, base + geom_line(key_glyph = draw_key_timeseries) would also produce the plot shown above. For more information about changing key glyphs, see https://www.emilhvitfeldt.com/post/changing-glyph-in-ggplot2/. "],["meeting-videos-13.html", "13.8 Meeting Videos", " 13.8 Meeting Videos 13.8.1 Cohort 1 Meeting chat log 00:07:09 June Choe: hello! 00:08:55 Federica Gazzelloni: Hello! 00:46:16 Kent Johnson: Examples of key glyphs: https://www.emilhvitfeldt.com/post/changing-glyph-in-ggplot2/ 00:48:30 June Choe: that one is just two overlapping points i think (with different sizes) 00:48:35 June Choe: (yes what kent said) "],["coordinate-systems.html", "Chapter 14 Coordinate systems", " Chapter 14 Coordinate systems Learning objectives: What are coord_<functions>? What are the differences between coord_<functions> in {ggplot2}? How to use coordinate systems in {ggplot2} "],["introduction-5.html", "14.1 Introduction", " 14.1 Introduction The coordinate system in {ggplot2} can be managed with the use of coord_<functions>. 
This is done when we need to: zoom into a plot in a particular area of the plot flip the axis of a plot set a fixed aspect ratio of a plot transform coordinates change the shape of the plot set the coordinates for a map projection library(tidyverse) library(patchwork) iris %>% head() Sepal.Length Sepal.Width Petal.Length Petal.Width Species 1 5.1 3.5 1.4 0.2 setosa 2 4.9 3.0 1.4 0.2 setosa 3 4.7 3.2 1.3 0.2 setosa 4 4.6 3.1 1.5 0.2 setosa 5 5.0 3.6 1.4 0.2 setosa 6 5.4 3.9 1.7 0.4 setosa "],["linear-coordinate-systems.html", "14.2 Linear coordinate systems", " 14.2 Linear coordinate systems coord_cartesian(): the default Cartesian coordinate system, where the 2d position of an element is given by the combination of the x and y positions. coord_flip(): Cartesian coordinate system with x and y axes flipped. coord_fixed(): Cartesian coordinate system with a fixed aspect ratio. coord_cartesian() p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point(aes(fill=Species), show.legend = F, shape=21,color="grey20",alpha=0.5) + geom_smooth(color="pink") + theme_light() p1 | p1 + scale_x_continuous(limits = c(5, 6)) | p1 + coord_cartesian(xlim = c(5, 6)) coord_flip() p2 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point(aes(fill=Species), show.legend = F, shape=21,color="grey20",alpha=0.5) + geom_smooth(color="pink") + theme_light() p3 <- ggplot(iris, aes(Sepal.Width,Sepal.Length)) + geom_point(aes(fill=Species), show.legend = F, shape=21,color="grey20",alpha=0.5) + geom_smooth(color="pink") + theme_light() p2 | p2 + coord_flip() | p3 (the smooth is fit to the rotated data). coord_fixed() p3 | p3 + coord_fixed() "],["non-linear-coordinate-systems.html", "14.3 Non-linear coordinate systems", " 14.3 Non-linear coordinate systems coord_polar(): Polar coordinates. coord_map()/coord_quickmap()/coord_sf(): Map projections. coord_trans(): Apply arbitrary transformations to x and y positions, after the data has been processed by the stat. 
coord_polar() p4 <- iris %>% ggplot(aes(x = Species, y = Petal.Width)) + geom_col(aes(color=Species,fill=Species),show.legend = F)+ theme_light() p4 + coord_polar(theta = "x") | p4 + coord_polar(theta = "y") 14.3.1 Example: Coord_polar() with DuBoisChallenge N°8 data source: DuBois data portraits df <- read_csv("https://raw.githubusercontent.com/ajstarks/dubois-data-portraits/master/challenge/2022/challenge08/data.csv") df2 <- df %>% arrange(-Year) df2[7,1] <- 1875 df2[7,2] <- 0 df2[7,3] <- 0 df2 %>% ggplot() + geom_line(data= subset(df2, Year %in% c(1875,1875)), mapping = aes(x=Year, y= `Houshold Value (Dollars)`), color="#FFCDCB",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880)), mapping= aes(x=Year +2, y= `Houshold Value (Dollars)`), color="#989EB4",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885)), mapping= aes(x=Year +4, y= `Houshold Value (Dollars)`), color="#b08c71",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890)), mapping= aes(x=Year +6, y= `Houshold Value (Dollars)`), color="#FFC942",size=6) + geom_line(data= subset(df2, Year%in%c(1875,1875,1880,1885,1890,1895)), mapping= aes(x=Year +8, y= `Houshold Value (Dollars)`), color="#EFDECC", size=6) + geom_line(mapping= aes(x=Year +10, y= `Houshold Value (Dollars)`), color="#F02C49",size=6) + coord_polar(theta = "y", start = 0, direction = 1, clip = "off") + # other scales that can be used: #scale_x_reverse(expand=expansion(mult=c(-0.9,-0.1),add=c(29,-0.1))) + #scale_y_continuous(expand=expansion(mult=c(0.09,0.01),add=c(0,-790000))) + scale_x_reverse(expand=expansion(add=c(11,-5))) + scale_y_continuous(expand=expansion(add=c(0,-600000))) + labs(title="ASSESSED VALUE OF HOUSEHOLD AND KITCHEN FURNITURE OWNED BY GEORGIA NEGROES.")+ theme_void() + theme(text = element_text(face="bold", color="grey27"), aspect.ratio =2/1.9, #y/x plot.background = element_rect(color= "#d9ccbf", fill= "#d9ccbf"), plot.title = element_text(hjust=0.5,size=9)) coord_trans() rect 
<- data.frame(x = 50, y = 50) line <- data.frame(x = c(1, 200), y = c(100, 1)) p6 <- ggplot(mapping = aes(x, y)) + geom_tile(data = rect, aes(width = 50, height = 50)) + geom_line(data = line) + xlab(NULL) + ylab(NULL) p6 p6 + coord_trans(y = "log10") p7 <- ggplot(iris, aes(Sepal.Length, Petal.Length)) + stat_bin2d() + geom_smooth(method = "lm") + xlab(NULL) + ylab(NULL) + theme(legend.position = "none") p7 #> `geom_smooth()` using formula 'y ~ x' # Better fit on log scale, but harder to interpret p7 + scale_x_log10() + scale_y_log10() #> `geom_smooth()` using formula 'y ~ x' # Fit on log scale, then backtransform to original. # Highlights lack of expensive diamonds with large carats pow10 <- scales::exp_trans(10) p7 + scale_x_log10() + scale_y_log10() + coord_trans(x = pow10, y = pow10) coord_map()/coord_quickmap()/coord_sf() world <- map_data("world") worldmap <- ggplot(world, aes(long, lat, group = group)) + geom_path() + scale_y_continuous(NULL, breaks = (-2:3) * 30, labels = NULL) + scale_x_continuous(NULL, breaks = (-4:4) * 45, labels = NULL) worldmap + coord_quickmap() | worldmap + coord_map("ortho") | worldmap + coord_map("stereographic") "],["meeting-videos-14.html", "14.4 Meeting Videos", " 14.4 Meeting Videos 14.4.1 Cohort 1 Meeting chat log 00:08:33 June Choe: hi all :) 00:08:50 Federica Gazzelloni: Hi 00:09:48 June Choe: yeah I think folks can catch up on youtube maybe 00:28:00 June Choe: thats very neat - didn't know you could "squish" the polar-transformed shapes with scale expansion 00:38:32 June Choe: An interesting discussion for coord_polar on twitter - https://twitter.com/mattansb/status/1506620436771229715?s=20&t=I4IebpuwA_ZxDwzA4BqqwQ 00:38:45 June Choe: I was in an exchange with @mattansb on how to "crop" polar coordinate plots 00:39:15 June Choe: this was his solution, and I find it quite nice - https://mattansb.github.io/MSBMisc/reference/crop_coord_polar.html 00:40:30 June Choe: this was great - thank you! 00:41:03 June Choe: sounds good! 
"],["faceting-2.html", "Chapter 15 Faceting", " Chapter 15 Faceting Learning objectives: Facet wrap Facet grid Controlling scales Missing faceting variables Grouping vs. faceting Continuous variables "],["facets.html", "15.1 Facets", " 15.1 Facets library(tidyverse) mpg2 <- subset(mpg, cyl != 5 & drv %in% c("4", "f") & class != "2seater") base <- ggplot(mpg2, aes(displ, hwy)) + geom_blank() + xlab(NULL) + ylab(NULL) mpg2%>%count(class) # A tibble: 6 × 2 class n <chr> <int> 1 compact 45 2 midsize 41 3 minivan 11 4 pickup 33 5 subcompact 24 6 suv 51 base + facet_wrap(~class, ncol = 3) base + facet_wrap(~class, ncol = 3, as.table = FALSE) base + facet_wrap(~class, nrow = 3) base + facet_wrap(~class, nrow = 3, dir = "v") base + facet_grid(. ~ cyl) base + facet_grid(drv ~ .) base + facet_grid(drv ~ cyl) p <- ggplot(mpg, aes(cty, hwy)) + geom_abline() + geom_jitter(width = 0.1, height = 0.1) p + facet_grid(drv ~ cyl) facet_wrap(~cyl) <ggproto object: Class FacetWrap, Facet, gg> compute_layout: function draw_back: function draw_front: function draw_labels: function draw_panels: function finish_data: function init_scales: function map_data: function params: list setup_data: function setup_params: function shrink: TRUE train_scales: function vars: function super: <ggproto object: Class FacetWrap, Facet, gg> p+ facet_wrap(~cyl, scales = "free_y") economics_long%>%count(date) # A tibble: 574 × 2 date n <date> <int> 1 1967-07-01 5 2 1967-08-01 5 3 1967-09-01 5 4 1967-10-01 5 5 1967-11-01 5 6 1967-12-01 5 7 1968-01-01 5 8 1968-02-01 5 9 1968-03-01 5 10 1968-04-01 5 # ℹ 564 more rows ggplot(economics_long, aes(date, value)) + geom_line() + facet_wrap(~variable, scales = "free_y", ncol = 1) mpg2$model <- reorder(mpg2$model, mpg2$cty) mpg2$manufacturer <- reorder(mpg2$manufacturer, -mpg2$cty) ggplot(mpg2, aes(cty, model)) + geom_point() + facet_grid(manufacturer ~ ., scales = "free", space = "free") + theme(strip.text.y = element_text(angle = 0)) df1 <- data.frame(x = 1:3, y = 
1:3, gender = c("f", "f", "m")) df2 <- data.frame(x = 2, y = 2) ggplot(df1, aes(x, y)) + geom_point(data = df2, colour = "red", size = 2) + geom_point() + facet_wrap(~gender) df <- data.frame( x = rnorm(120, c(0, 2, 4)), y = rnorm(120, c(1, 2, 1)), z = letters[1:3] ) ggplot(df, aes(x, y)) + geom_point(aes(colour = z)) ggplot(df, aes(x, y)) + geom_point(aes(color=z)) + facet_wrap(~z) df_sum <- df %>% group_by(z) %>% summarise(x = mean(x), y = mean(y)) %>% rename(z2 = z) ggplot(df, aes(x, y)) + geom_point() + geom_point(data = df_sum, aes(colour = z2), size = 4) + facet_wrap(~z) df2 <- dplyr::select(df, -z) ggplot(df, aes(x, y)) + geom_point(data = df2, colour = "grey70") + geom_point(aes(colour = z)) + facet_wrap(~z) age<-seq(18,60,1) id <- seq_along(age) my_df <- as.data.frame(cbind(id,age)) my_df %>% mutate(age_cat=cut_interval(age,length=5))%>%head() id age age_cat 1 1 18 [15,20] 2 2 19 [15,20] 3 3 20 [15,20] 4 4 21 (20,25] 5 5 22 (20,25] 6 6 23 (20,25] # Bins of width 1 mpg2$disp_w <- cut_width(mpg2$displ, 1) # Six bins of equal length mpg2$disp_i <- cut_interval(mpg2$displ, 6) # Six bins containing equal numbers of points mpg2$disp_n <- cut_number(mpg2$displ, 6) plot <- ggplot(mpg2, aes(cty, hwy)) + geom_point() + labs(x = NULL, y = NULL) plot + facet_wrap(~disp_w, nrow = 1) "],["meeting-videos-15.html", "15.2 Meeting Videos", " 15.2 Meeting Videos 15.2.1 Cohort 1 "],["themes.html", "Chapter 16 Themes", " Chapter 16 Themes Learning objectives: How can I customize the output of my plot? What are the functions theme_<function>() and theme()? "],["theme.html", "16.1 Theme", " 16.1 Theme Plots can be customized by adding these functions to your plot: scale_fill/color_ theme_ theme() … 16.1.1 Complete themes In ggplot2 there are preset themes ready to use: library(tidyverse) df <- data.frame(x = 1:3, y = 1:3) base <- ggplot(df, aes(x, y)) + geom_point() p1<-base + theme_grey() + ggtitle("theme_grey()") p2<-base + theme_bw() + ggtitle("theme_bw()") p3<-base + 
theme_linedraw() + ggtitle("theme_linedraw()") library(patchwork) p1+p2+p3 p4<-base + theme_light() + ggtitle("theme_light()") p5<- base + theme_dark() + ggtitle("theme_dark()") p6<-base + theme_minimal() + ggtitle("theme_minimal()") p4+p5+p6 p7<-base + theme_classic() + ggtitle("theme_classic()") p8<-base + theme_void() + ggtitle("theme_void()") p7+p8 Or you can use other packages such as {ggthemes}, or others listed in the ggplot extension gallery: library(ggthemes) p9<-base + theme_tufte() + ggtitle("theme_tufte()") p10<-base + theme_solarized() + ggtitle("theme_solarized()") p11<-base + theme_excel() + ggtitle("theme_excel()") p9+p10+p11 Components of a complete theme can be modified with the theme() function. "],["plot-elements-of-a-theme.html", "16.2 Plot elements of a theme", " 16.2 Plot elements of a theme Axis elements Legend elements Panel elements Faceting elements See ?theme in the help pane of RStudio for more info. "],["meeting-videos-16.html", "16.3 Meeting Videos", " 16.3 Meeting Videos 16.3.1 Cohort 1 "],["programming-with-ggplot2.html", "Chapter 17 Programming with ggplot2", " Chapter 17 Programming with ggplot2 Learning objectives: Programming single and multiple components Use components, annotation, and additional arguments in a plot Functional programming What are the components of a plot? data.frame aes() Scales Coord systems Theme components "],["programming-single-and-multiple-components.html", "17.1 Programming single and multiple components", " 17.1 Programming single and multiple components In ggplot2 it is possible to build up plot components easily. This is a good practice to reduce duplicated code. Generalising code gives you more flexibility when making customised plots. 
17.1.1 Components One example of a plot component is shown below: bestfit <- geom_smooth( method = "lm", se = FALSE, colour = alpha("steelblue", 0.5), size = 2) This single component can be placed inside the syntax of the grammar of graphics and used as a plot layer. ggplot(mpg, aes(cty, hwy)) + geom_point() + bestfit Another way is to build a layer by wrapping it in a function: geom_lm <- function(formula = y ~ x, colour = alpha("steelblue", 0.5), size = 2, ...) { geom_smooth(formula = formula, se = FALSE, method = "lm", colour = colour, size = size, ...) } And then apply the function layer to the plot: ggplot(mpg, aes(displ, 1 / hwy)) + geom_point() + geom_lm(y ~ poly(x, 2), size = 1, colour = "red") The book draws attention to the “open” parameter (...). A suggestion is to use it inside the function instead of in the function parameter definition. Instead of only one component, we can build a plot made of several components. geom_mean <- function() { list( stat_summary(fun = "mean", geom = "bar", fill = "grey70"), stat_summary(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4) ) } With this result: ggplot(mpg, aes(class, cty)) + geom_mean() "],["use-components-annotation-and-additional-arguments-in-a-plot.html", "17.2 Use components, annotation, and additional arguments in a plot", " 17.2 Use components, annotation, and additional arguments in a plot We have just seen some examples of how to make new components; what if we want to know more about existing components? As an example, consider the borders() function, provided by {ggplot2} to create a layer of map borders. “A quick and dirty way to get map data (from the maps package) on to your plot.” borders <- function(database = "world", regions = ".", fill = NA, colour = "grey50", ...) 
{ df <- map_data(database, regions) geom_polygon( aes_(~long, ~lat, group = ~group), data = df, fill = fill, colour = colour, ..., inherit.aes = FALSE, show.legend = FALSE ) } library(maps) data(us.cities) capitals <- subset(us.cities, capital == 2) ggplot(capitals, aes(long, lat)) + borders("world", xlim = c(-130, -60), ylim = c(20, 50)) + geom_point(aes(size = pop)) + scale_size_area() + coord_quickmap() We can even pass additional arguments through to the components, using helpers such as modifyList() and do.call(): geom_mean <- function(..., bar.params = list(), errorbar.params = list()) { params <- list(...) bar.params <- modifyList(params, bar.params) errorbar.params <- modifyList(params, errorbar.params) bar <- do.call("stat_summary", modifyList( list(fun = "mean", geom = "bar", fill = "grey70"), bar.params) ) errorbar <- do.call("stat_summary", modifyList( list(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4), errorbar.params) ) list(bar, errorbar) } And here is the result: ggplot(mpg, aes(class, cty)) + geom_mean( colour = "steelblue", errorbar.params = list(width = 0.5, size = 1) ) "],["functional-programming.html", "17.3 Functional programming", " 17.3 Functional programming An example is to make a geom. For this we can have a look at the “Corporate Reputation” data from #TidyTuesday 2022 week 22. 
poll <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/poll.csv') reputation <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-31/reputation.csv') rep2<-reputation%>% group_by(company,industry)%>% summarize(score,rank)%>% ungroup()%>% mutate(year=2022) full <- poll%>% filter(!is.na(year))%>% full_join(rep2,by=c("2022_rank"="rank","2022_rq"="score","company","industry","year")) %>% count(year,company,industry,"rank"=`2022_rank`,"score"=`2022_rq`,sort=T) %>% arrange(-year) ################## # mapping = aes(x = fct_reorder(x,-y), y = y, fill = y, color = y, label = y) rank_plot <- function(data,mapping) { data %>% ggplot(mapping)+ # aes(x=fct_reorder(x,-y),y=y) geom_col(width =0.3, # aes(fill=rank) show.legend = F)+ geom_text(hjust=0,fontface="bold", # aes(label=rank,color=rank), show.legend = F)+ scale_y_discrete(expand = c(0, 0, .5, 0))+ coord_flip()+ ggthemes::scale_fill_continuous_tableau(palette = "Green-Gold")+ ggthemes::scale_color_continuous_tableau(palette = "Green-Gold")+ labs(title="", x="",y="")+ theme(axis.text.x = element_blank(), axis.text.y = element_text(face="bold"), axis.ticks.x = element_blank(), axis.ticks.y = element_line(size=2), panel.grid.major.x = element_blank(), panel.grid.minor.x = element_blank(), panel.grid.major.y = element_line(size=2), plot.background = element_rect(color="grey95",fill="grey95"), panel.background = element_rect(color="grey92",fill="grey92")) } df<-full%>% filter(year==2017, industry=="Retail") rank_plot(data = df, mapping = aes(x=fct_reorder(company,-rank),y=rank, fill = rank, color = rank, label = rank)) "],["references.html", "17.4 References", " 17.4 References extending ggplot2 functions expressions functional programming advanced R - functionals "],["meeting-videos-17.html", "17.5 Meeting Videos", " 17.5 Meeting Videos 17.5.1 Cohort 1 Meeting chat log 00:41:31 Priyanka Gagneja: There’s a lot of 
disturbance :( 01:00:48 Priyanka Gagneja: https://plotly.com/ggplot2/setting-graph-size/ "],["internals-of-ggplot2.html", "Chapter 18 Internals of ggplot2", " Chapter 18 Internals of ggplot2 Learning Objectives What is the difference between user-facing code and internal code? What is the distinction between ggplot_build() and ggplot_gtable()? What is the division of labor between {ggplot2} and {grid}? What is the basic structure of/motivation for ggproto? library(ggplot2) library(ggtrace) # remotes::install_github("yjunechoe/ggtrace") library(purrr) library(dplyr) 18.0.0.1 Introduction (the existence of internals) The user-facing code that defines a ggplot on the surface is not the same as the internal code that creates a ggplot under the hood. In this chapter, we’ll learn about how the internal code operates and develop some intuitions for thinking about the internals, starting with these two simple examples of mismatches between surface and underlying form: 18.0.0.2 Case 1: Order You can change the order of some “layers” without changing the graphical output. 
For example, scale_*() can be added anywhere and always ends up applying for the whole plot: ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + scale_x_log10() + #< scale first geom_point() + geom_smooth() ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + geom_point() + scale_x_log10() + #< scale middle geom_smooth() Though the order of geom_*() and stat_*() matters for order of drawing: ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + geom_point() + geom_smooth(fill = "black") ggplot(mtcars, aes(mpg, hp, color = as.factor(am))) + geom_smooth(fill = "black") + geom_point() 18.0.0.3 Case 2: Modularity We know that user-facing “layer” code that we add to a ggplot with + are stand-alone functions: lm_smooth <- geom_smooth(method = "lm", formula = y ~ x) lm_smooth geom_smooth: na.rm = FALSE, orientation = NA, se = TRUE stat_smooth: na.rm = FALSE, orientation = NA, se = TRUE, method = lm, formula = y ~ x position_identity When we add this object to different ggplots, it materializes in different ways: ggplot(mtcars, aes(mpg, hp)) + lm_smooth ggplot(mtcars, aes(wt, disp)) + lm_smooth "],["the-plot-method.html", "18.1 The plot() method", " 18.1 The plot() method The user-facing code and internal code is also separated by when they are evaluated. The user-facing code like geom_smooth() is evaluated immediately to give you a ggplot object, but the internal code is only evaluated when a ggplot object is printed or plotted, via print() and plot(). The following code simply creates a ggplot object from user-facing code, and DOES NOT print or plot the ggplot (yet). p <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point(position = position_jitter(seed = 2022)) + geom_smooth(method = "lm", formula = y ~ x) + facet_wrap(vars(year)) + ggtitle("A plot for expository purposes") The ggplot object is actually just a list under the hood: class(p) [1] "gg" "ggplot" typeof(p) [1] "list" Evaluating the ggplot is what gives you the actual points, rectangles, text, etc. 
that make up the figure (and you can also do so explicitly with print()/plot()) p # print(p) # plot(p) These are two separate processes, but we often think of them as one monolithic process: defining_benchmark <- bench::mark( # Evaluates user-facing code to define ggplot, # but does not call plot/print method p <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point(position = position_jitter(seed = 2022)) + geom_smooth(method = "lm", formula = y ~ x) + facet_wrap(vars(year)) + ggtitle("A plot for expository purposes") ) plotting_benchmark <- bench::mark( # Plots the ggplot plot(p) ) bind_rows( defining_benchmark[,2:5], plotting_benchmark[,2:5] ) # A tibble: 2 × 4 min median `itr/sec` mem_alloc <bch:tm> <bch:tm> <dbl> <bch:byt> 1 3.09ms 3.2ms 305. 20.36KB 2 234.38ms 234.4ms 4.27 3.51MB The plot that gets rendered from a ggplot object is actually a side effect of evaluating the ggplot object: # Same as ggplot2:::print.ggplot ggplot2:::plot.ggplot function (x, newpage = is.null(vp), vp = NULL, ...) 
{ set_last_plot(x) if (newpage) grid.newpage() grDevices::recordGraphics(requireNamespace("ggplot2", quietly = TRUE), list(), getNamespace("ggplot2")) data <- ggplot_build(x) gtable <- ggplot_gtable(data) if (is.null(vp)) { grid.draw(gtable) } else { if (is.character(vp)) seekViewport(vp) else pushViewport(vp) grid.draw(gtable) upViewport() } if (isTRUE(getOption("BrailleR.VI")) && rlang::is_installed("BrailleR")) { print(asNamespace("BrailleR")$VI(x)) } invisible(x) } <bytecode: 0x55d9296d2200> <environment: namespace:ggplot2> The above code can be simplified to this: ggprint <- function(x) { data <- ggplot_build(x) gtable <- ggplot_gtable(data) grid::grid.newpage() grid::grid.draw(gtable) return(invisible(x)) #< hence "side effect" } ggprint(p) Roughly put, you start out with the ggplot object, which gets passed to ggplot_build(), the result of which in turn gets passed to ggplot_gtable() and is finally drawn with {grid}: library(grid) grid.newpage() # Clear display p %>% ggplot_build() %>% # 1. data for each layer is prepared for drawing ggplot_gtable() %>% # 2. drawing-ready data is turned into graphical elements grid.draw() # 3. graphical elements are converted to an image At each step, you get closer to the low-level information you need to draw the actual plot: obj_byte <- function(x) { scales::label_bytes()(as.numeric(object.size(x))) } # ggplot object p %>% obj_byte() [1] "32 kB" # data used to make graphical elements ggplot_build(p) %>% obj_byte() [1] "102 kB" # graphical elements for the plot ggplot_gtable(ggplot_build(p)) %>% obj_byte() [1] "684 kB" # the rendered plot ggsave( filename = tempfile(fileext = ".png"), plot = ggplot_gtable(ggplot_build(p)), # File size depends on format, dimension, resolution, etc. ) %>% file.size() %>% {scales::label_bytes()(.)} [1] "243 kB" The rest of the chapter focuses on what happens in this pipeline - the ggplot_build() step and the ggplot_gtable() step. 
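To make the pipeline above more concrete, here is a minimal sketch that runs the first two steps by hand and inspects the intermediate objects. The plot q is a hypothetical stand-in rather than the chapter's p, and accessing the built data with built$data assumes the list-like return value shown in this chapter; the exact structure and columns may differ across ggplot2 versions.

```r
library(ggplot2)

# Hypothetical example plot (a stand-in for the chapter's `p`)
q <- ggplot(mtcars, aes(mpg, hp)) +
  geom_point()

# Step 1: the build step prepares drawing-ready data, one data frame per layer
built <- ggplot_build(q)
layer_df <- built$data[[1]]
nrow(layer_df)                  # one row per point to be drawn
head(layer_df[, c("x", "y")])   # positions after scales have been applied

# Step 2: the gtable step turns that data into graphical elements
gt <- ggplot_gtable(built)
class(gt)

# Step 3 (drawing) would be: grid::grid.newpage(); grid::grid.draw(gt)
```

Keeping the intermediate objects in variables, rather than piping straight through to grid.draw(), makes it easier to see which step is responsible for a given change in the data.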
"],["the-build-step.html", "18.2 The build step", " 18.2 The build step This is the function body of ggplot_build(): ggplot2:::ggplot_build.ggplot function (plot) { plot <- plot_clone(plot) if (length(plot$layers) == 0) { plot <- plot + geom_blank() } layers <- plot$layers data <- rep(list(NULL), length(layers)) scales <- plot$scales data <- by_layer(function(l, d) l$layer_data(plot$data), layers, data, "computing layer data") data <- by_layer(function(l, d) l$setup_layer(d, plot), layers, data, "setting up layer") layout <- create_layout(plot$facet, plot$coordinates, plot$layout) data <- layout$setup(data, plot$data, plot$plot_env) data <- by_layer(function(l, d) l$compute_aesthetics(d, plot), layers, data, "computing aesthetics") data <- .ignore_data(data) data <- lapply(data, scales$transform_df) scale_x <- function() scales$get_scales("x") scale_y <- function() scales$get_scales("y") layout$train_position(data, scale_x(), scale_y()) data <- layout$map_position(data) data <- .expose_data(data) data <- by_layer(function(l, d) l$compute_statistic(d, layout), layers, data, "computing stat") data <- by_layer(function(l, d) l$map_statistic(d, plot), layers, data, "mapping stat to aesthetics") plot$scales$add_missing(c("x", "y"), plot$plot_env) data <- by_layer(function(l, d) l$compute_geom_1(d), layers, data, "setting up geom") data <- by_layer(function(l, d) l$compute_position(d, layout), layers, data, "computing position") data <- .ignore_data(data) layout$reset_scales() layout$train_position(data, scale_x(), scale_y()) layout$setup_panel_params() data <- layout$map_position(data) layout$setup_panel_guides(plot$guides, plot$layers) npscales <- scales$non_position_scales() if (npscales$n() > 0) { lapply(data, npscales$train_df) plot$guides <- plot$guides$build(npscales, plot$layers, plot$labels, data) data <- lapply(data, npscales$map_df) } else { plot$guides <- plot$guides$get_custom() } data <- .expose_data(data) data <- by_layer(function(l, d) 
l$compute_geom_2(d), layers, data, "setting up geom aesthetics") data <- by_layer(function(l, d) l$finish_statistics(d), layers, data, "finishing layer stat") data <- layout$finish_data(data) plot$labels$alt <- get_alt_text(plot) structure(list(data = data, layout = layout, plot = plot), class = "ggplot_built") } <bytecode: 0x55d91a541140> <environment: namespace:ggplot2> It takes the ggplot object as input, and transforms the user-provided data to a drawing-ready data (+ some other auxiliary/meta-data like information about the layout). You can see that the drawing-ready data data is built up incrementally (much like data wrangling minus pipes): as.list(body(ggplot2:::ggplot_build.ggplot)) [[1]] `{` [[2]] plot <- plot_clone(plot) [[3]] if (length(plot$layers) == 0) { plot <- plot + geom_blank() } [[4]] layers <- plot$layers [[5]] data <- rep(list(NULL), length(layers)) [[6]] scales <- plot$scales [[7]] data <- by_layer(function(l, d) l$layer_data(plot$data), layers, data, "computing layer data") [[8]] data <- by_layer(function(l, d) l$setup_layer(d, plot), layers, data, "setting up layer") [[9]] layout <- create_layout(plot$facet, plot$coordinates, plot$layout) [[10]] data <- layout$setup(data, plot$data, plot$plot_env) [[11]] data <- by_layer(function(l, d) l$compute_aesthetics(d, plot), layers, data, "computing aesthetics") [[12]] data <- .ignore_data(data) [[13]] data <- lapply(data, scales$transform_df) [[14]] scale_x <- function() scales$get_scales("x") [[15]] scale_y <- function() scales$get_scales("y") [[16]] layout$train_position(data, scale_x(), scale_y()) [[17]] data <- layout$map_position(data) [[18]] data <- .expose_data(data) [[19]] data <- by_layer(function(l, d) l$compute_statistic(d, layout), layers, data, "computing stat") [[20]] data <- by_layer(function(l, d) l$map_statistic(d, plot), layers, data, "mapping stat to aesthetics") [[21]] plot$scales$add_missing(c("x", "y"), plot$plot_env) [[22]] data <- by_layer(function(l, d) l$compute_geom_1(d), 
layers, data, "setting up geom") [[23]] data <- by_layer(function(l, d) l$compute_position(d, layout), layers, data, "computing position") [[24]] data <- .ignore_data(data) [[25]] layout$reset_scales() [[26]] layout$train_position(data, scale_x(), scale_y()) [[27]] layout$setup_panel_params() [[28]] data <- layout$map_position(data) [[29]] layout$setup_panel_guides(plot$guides, plot$layers) [[30]] npscales <- scales$non_position_scales() [[31]] if (npscales$n() > 0) { lapply(data, npscales$train_df) plot$guides <- plot$guides$build(npscales, plot$layers, plot$labels, data) data <- lapply(data, npscales$map_df) } else { plot$guides <- plot$guides$get_custom() } [[32]] data <- .expose_data(data) [[33]] data <- by_layer(function(l, d) l$compute_geom_2(d), layers, data, "setting up geom aesthetics") [[34]] data <- by_layer(function(l, d) l$finish_statistics(d), layers, data, "finishing layer stat") [[35]] data <- layout$finish_data(data) [[36]] plot$labels$alt <- get_alt_text(plot) [[37]] structure(list(data = data, layout = layout, plot = plot), class = "ggplot_built") 18.2.1 Data preparation The data from the ggplot is prepared in a special format for each layer (essentially, just a dataframe with a predictable set of column names). A layer (specifically, the output of ggplot2::layer()) can provide data in one of three ways: Inherited from the data supplied to ggplot() Supplied directly from the layer’s data argument A function that returns a data when applied to the global data data_demo_p <- ggplot(mtcars, aes(disp, cyl)) + # 1) Inherited data geom_point(color = "blue") + # 2) Data supplied directly geom_point( color = "red", alpha = .2, data = mpg %>% mutate(disp = displ * 100) ) + # 3) Function to be applied to inherited data geom_label( aes(label = paste("cyl:", cyl)), data = . 
%>% group_by(cyl) %>% summarize(disp = mean(disp)) ) data_demo_p Inside the layers element of the ggplot are Layer objects which hold information about each layer: data_demo_p$layers map( data_demo_p$layers, class ) And the calculated data from each layer can be accessed with layer_data() method of the Layer object: ggplot2:::Layer$layer_data <ggproto method> <Wrapper function> function (...) layer_data(..., self = self) <Inner function (f)> function (self, plot_data) { if (is.waive(self$data)) { data <- plot_data } else if (is.function(self$data)) { data <- self$data(plot_data) if (!is.data.frame(data)) { cli::cli_abort("{.fn layer_data} must return a {.cls data.frame}.") } } else { data <- self$data } if (is.null(data) || is.waive(data)) data else unrowname(data) } data_demo_p$layers[[1]]$layer_data(data_demo_p$data) mpg cyl disp hp drat wt qsec vs am gear carb 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 26 27.3 4 79.0 66 
4.08 1.935 18.90 1 1 4 1 27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 32 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 data_demo_p$layers[[2]]$layer_data(data_demo_p$data) # A tibble: 234 × 12 manufacturer model displ year cyl trans drv cty hwy fl class <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> 1 audi a4 1.8 1999 4 auto… f 18 29 p comp… 2 audi a4 1.8 1999 4 manu… f 21 29 p comp… 3 audi a4 2 2008 4 manu… f 20 31 p comp… 4 audi a4 2 2008 4 auto… f 21 30 p comp… 5 audi a4 2.8 1999 6 auto… f 16 26 p comp… 6 audi a4 2.8 1999 6 manu… f 18 26 p comp… 7 audi a4 3.1 2008 6 auto… f 18 27 p comp… 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp… 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp… 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp… # ℹ 224 more rows # ℹ 1 more variable: disp <dbl> data_demo_p$layers[[3]]$layer_data(data_demo_p$data) # A tibble: 3 × 2 cyl disp <dbl> <dbl> 1 4 105. 2 6 183. 3 8 353. 
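The three cases that layer_data() distinguishes above can be mimicked in a few lines of base R. This is only a sketch of the branching logic, not ggplot2's implementation: resolve_layer_data() is a hypothetical helper, and NULL is used as a stand-in for ggplot2's internal waiver object.

```r
# Hypothetical helper mirroring the three branches of Layer$layer_data().
# Assumption: NULL plays the role of ggplot2's waiver() ("inherit the plot data").
resolve_layer_data <- function(layer_data, plot_data) {
  if (is.null(layer_data)) {
    # 1) Nothing supplied: inherit the data given to ggplot()
    plot_data
  } else if (is.function(layer_data)) {
    # 3) A function: apply it to the plot-level data
    out <- layer_data(plot_data)
    if (!is.data.frame(out)) {
      stop("A data function must return a data frame.")
    }
    out
  } else {
    # 2) A data frame supplied directly to the layer
    layer_data
  }
}

inherited <- resolve_layer_data(NULL, mtcars)
direct    <- resolve_layer_data(head(iris), mtcars)
computed  <- resolve_layer_data(
  function(d) aggregate(disp ~ cyl, data = d, FUN = mean),
  mtcars
)
```

The third call corresponds to the geom_label() layer in data_demo_p: the summary is computed from whatever data the plot was given, so swapping the plot data also updates the labels.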
This is where the data transformation journey begins inside the plot method: body(ggplot2:::ggplot_build.ggplot)[[5]] data <- rep(list(NULL), length(layers)) body(ggplot2:::ggplot_build.ggplot)[[8]] data <- by_layer(function(l, d) l$setup_layer(d, plot), layers, data, "setting up layer") For data_demo_p, the data variable after step 8 looks like this: ggtrace_inspect_vars( x = data_demo_p, method = ggplot2:::ggplot_build.ggplot, at = 9, vars = "data" ) [[1]] mpg cyl disp hp drat wt qsec vs am gear carb 1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 11 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 12 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 13 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 14 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 15 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 16 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 17 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 18 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 19 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 20 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 21 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 22 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 23 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 24 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 25 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 26 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 27 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 28 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 29 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 30 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 31 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 32 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 [[2]] # A tibble: 234 × 12 manufacturer model displ year cyl trans drv cty 
hwy fl class <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr> 1 audi a4 1.8 1999 4 auto… f 18 29 p comp… 2 audi a4 1.8 1999 4 manu… f 21 29 p comp… 3 audi a4 2 2008 4 manu… f 20 31 p comp… 4 audi a4 2 2008 4 auto… f 21 30 p comp… 5 audi a4 2.8 1999 6 auto… f 16 26 p comp… 6 audi a4 2.8 1999 6 manu… f 18 26 p comp… 7 audi a4 3.1 2008 6 auto… f 18 27 p comp… 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp… 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp… 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp… # ℹ 224 more rows # ℹ 1 more variable: disp <dbl> [[3]] # A tibble: 3 × 2 cyl disp <dbl> <dbl> 1 4 105. 2 6 183. 3 8 353. For the expository plot p from the book, the data variable after step 8 looks like the original mpg data: ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 9, vars = "data" ) %>% map(head) 18.2.2 Data transformation 18.2.2.1 PANEL variable and aesthetic mappings Continuing with the book example, the data is augmented with the PANEL variable at Step 11: body(ggplot2:::ggplot_build.ggplot)[[11]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 12, vars = "data" ) %>% map(head) And then the group variable appears at Step 12, which is also when aesthetics get “mapped” (= just mutate(), essentially): body(ggplot2:::ggplot_build.ggplot)[[12]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 13, vars = "data" ) %>% map(head) 18.2.2.2 Scales Then, scales are applied in Step 13. 
This leaves the data unchanged for the original plot: body(ggplot2:::ggplot_build.ggplot)[[13]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 14, vars = "data" ) %>% map(head) But the effect can be seen with something like scale_x_log10(): ggtrace_inspect_vars( x = p + scale_x_log10(), method = ggplot2:::ggplot_build.ggplot, at = 14, vars = "data" ) %>% map(head, 3) Out-of-bounds handling happens down the line, at Step 17: body(ggplot2:::ggplot_build.ggplot)[[17]] ggtrace_inspect_vars( x = p + xlim(2, 8), # or scale_x_continuous(oob = scales::oob_censor) method = ggplot2:::ggplot_build.ggplot, at = 18, vars = "data" ) %>% map(head, 3) 18.2.2.3 Stat Stat transformation happens right after, at Step 18 (this is why understanding out-of-bounds handling and scale transformation is important!): body(ggplot2:::ggplot_build.ggplot)[[18]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 19, vars = "data" ) %>% map(head, 3) Note how, from this point on, the data for the two layers look different. This is because geom_point() and geom_smooth() have different Stats. class( geom_point()$stat ) [1] "StatIdentity" "Stat" "ggproto" "gg" class( geom_smooth()$stat ) [1] "StatSmooth" "Stat" "ggproto" "gg" 18.2.2.4 Position At Step 22, positions are adjusted (jittering, dodging, stacking, etc.). 
We gave geom_point() a jitter so we see that reflected for the first layer: body(ggplot2:::ggplot_build.ggplot)[[22]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 23, vars = "data" ) %>% map(head, 3) 18.2.2.5 Geom Variables relevant for drawing each layer’s geometry are added in by the Geom, at Step 29: body(ggplot2:::ggplot_build.ggplot)[[29]] ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, at = 30, vars = "data" ) %>% map(head, 3) 18.2.3 Output The final state of the data after ggplot_build() is stored in the data element of the output of ggplot_build(): ggplot_build(p)$data %>% map(head, 3) ggplot_build() also returns the trained layout of the plot (scales, panels, etc.) in the layout element, as well as the original ggplot object in the plot element: lapply( ggplot_build(p), class ) 18.2.4 Explore The building of p ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_build.ggplot, vars = "data" ) %>% map(map, head, 3) The making of p + scale_x_log10(limits = c(2, 5), oob = scales::oob_censor) ggtrace_inspect_vars( x = p + scale_x_log10(limits = c(2, 5), oob = scales::oob_censor), method = ggplot2:::ggplot_build.ggplot, vars = "data" ) %>% map(map, head, 3) "],["the-gtable-step.html", "18.3 The gtable step", " 18.3 The gtable step Again, still working with our plot p p <- ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point(position = position_jitter(seed = 2022)) + geom_smooth(method = "lm", formula = y ~ x) + facet_wrap(vars(year)) + ggtitle("A plot for expository purposes") p # print(p) # plot(p) The return value of ggplot_build() contains the computed data associated with each layer and a Layout ggproto object which holds information about data other than the layers, including the scales, coordinate system, facets, etc. 
names(ggplot_build(p)) [1] "data" "layout" "plot" ggplot_build(p)$data %>% map(head, 3) class(ggplot_build(p)$layout) [1] "Layout" "ggproto" "gg" The output of ggplot_build() is then passed to ggplot_gtable() to be converted into graphical elements before being drawn: ggplot2:::plot.ggplot function (x, newpage = is.null(vp), vp = NULL, ...) { set_last_plot(x) if (newpage) grid.newpage() grDevices::recordGraphics(requireNamespace("ggplot2", quietly = TRUE), list(), getNamespace("ggplot2")) data <- ggplot_build(x) gtable <- ggplot_gtable(data) if (is.null(vp)) { grid.draw(gtable) } else { if (is.character(vp)) seekViewport(vp) else pushViewport(vp) grid.draw(gtable) upViewport() } if (isTRUE(getOption("BrailleR.VI")) && rlang::is_installed("BrailleR")) { print(asNamespace("BrailleR")$VI(x)) } invisible(x) } <bytecode: 0x55d9296d2200> <environment: namespace:ggplot2> 18.3.1 Rendering the panels First, each layer is converted into a list of graphical objects (grobs) … body(ggplot2:::ggplot_gtable.ggplot_built)[[6]] geom_grobs <- by_layer(function(l, d) l$draw_geom(d, layout), plot$layers, data, "converting geom to grob") This step loops through each layer, taking the layer object l and the data associated with that layer d, and uses the layer’s Geom to draw the data. 
geom_grobs <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 7, vars = "geom_grobs" ) geom_grobs [[1]] [[1]]$`1` points[geom_point.points.29302] [[1]]$`2` points[geom_point.points.29304] [[2]] [[2]]$`1` gTree[geom_smooth.gTree.29321] [[2]]$`2` gTree[geom_smooth.gTree.29338] The geom_grobs calculated at this step can also be accessed using the layer_grob() function on the ggplot object, which is similar to the layer_data() function: list( layer_grob(p, i = 1), layer_grob(p, i = 2) ) [[1]] [[1]]$`1` points[geom_point.points.29462] [[1]]$`2` points[geom_point.points.29464] [[2]] [[2]]$`1` gTree[geom_smooth.gTree.29481] [[2]]$`2` gTree[geom_smooth.gTree.29498] Each element of geom_grobs is a list of graphical objects representing a layer’s data in a facet. For example, this draws the data plotted by the first layer in the first facet grid.newpage() pushViewport(viewport()) grid.draw(geom_grobs[[1]][[1]]) After this, the facet takes over and assembles the panels… The graphical representation of each layer in each facet are combined with other “non-data” elements of the plot at this step, where the plot_table variable is defined. 
body(ggplot2:::ggplot_gtable.ggplot_built)[[8]] legend_box <- plot$guides$assemble(theme) plot_table <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 9, vars = "plot_table" ) plot_table is a special grob called a gtable, which is the same structure as the final form of the ggplot figure before it’s sent off to the rendering system to get drawn: plot_table TableGrob (6 x 9) "layout": 16 grobs z cells name grob 1 1 (4-4,3-3) panel-1-1 gTree[panel-1.gTree.29550] 2 1 (4-4,7-7) panel-2-1 gTree[panel-2.gTree.29564] 3 3 (2-2,3-3) axis-t-1-1 zeroGrob[NULL] 4 3 (2-2,7-7) axis-t-2-1 zeroGrob[NULL] 5 3 (5-5,3-3) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29567] 6 3 (5-5,7-7) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29567] 7 3 (4-4,6-6) axis-l-1-2 zeroGrob[NULL] 8 3 (4-4,2-2) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29573] 9 3 (4-4,8-8) axis-r-1-2 zeroGrob[NULL] 10 3 (4-4,4-4) axis-r-1-1 zeroGrob[NULL] 11 2 (3-3,3-3) strip-t-1-1 gtable[strip] 12 2 (3-3,7-7) strip-t-2-1 gtable[strip] 13 4 (1-1,3-7) xlab-t zeroGrob[NULL] 14 5 (6-6,3-7) xlab-b titleGrob[axis.title.x.bottom..titleGrob.29623] 15 6 (4-4,1-1) ylab-l titleGrob[axis.title.y.left..titleGrob.29626] 16 7 (4-4,9-9) ylab-r zeroGrob[NULL] When it is first defined, it’s only a partially complete representation of the plot - title, legend, margins, etc. are missing: grid.newpage() grid.draw(plot_table) Recall that plot_table is the output of layout$render: body(ggplot2:::ggplot_gtable.ggplot_built)[[8]] legend_box <- plot$guides$assemble(theme) This is the load-bearing step that computes/defines a bunch of smaller components internally: ggplot_build(p)$layout$render <ggproto method> <Wrapper function> function (...) 
render(..., self = self) <Inner function (f)> function (self, panels, data, theme, labels) { facet_bg <- self$facet$draw_back(data, self$layout, self$panel_scales_x, self$panel_scales_y, theme, self$facet_params) facet_fg <- self$facet$draw_front(data, self$layout, self$panel_scales_x, self$panel_scales_y, theme, self$facet_params) panels <- lapply(seq_along(panels[[1]]), function(i) { panel <- lapply(panels, `[[`, i) panel <- c(facet_bg[i], panel, facet_fg[i]) coord_fg <- self$coord$render_fg(self$panel_params[[i]], theme) coord_bg <- self$coord$render_bg(self$panel_params[[i]], theme) if (isTRUE(theme$panel.ontop)) { panel <- c(panel, list(coord_bg), list(coord_fg)) } else { panel <- c(list(coord_bg), panel, list(coord_fg)) } ggname(paste("panel", i, sep = "-"), gTree(children = inject(gList(!!!panel)))) }) plot_table <- self$facet$draw_panels(panels, self$layout, self$panel_scales_x, self$panel_scales_y, self$panel_params, self$coord, data, theme, self$facet_params) labels <- self$coord$labels(list(x = self$resolve_label(self$panel_scales_x[[1]], labels), y = self$resolve_label(self$panel_scales_y[[1]], labels)), self$panel_params[[1]]) labels <- self$render_labels(labels, theme) self$facet$draw_labels(plot_table, self$layout, self$panel_scales_x, self$panel_scales_y, self$panel_params, self$coord, data, theme, labels, self$params) } We can inspect these individual components: layout_render_env <- ggtrace_capture_env(p, ggplot2:::Layout$render) # grob in between the Coord's background and the layer for each panel layout_render_env$facet_bg [[1]] zeroGrob[NULL] [[2]] zeroGrob[NULL] # grob in between the Coord's foreground and the layer for each panel layout_render_env$facet_fg [[1]] zeroGrob[NULL] [[2]] zeroGrob[NULL] # individual panels (integrating the bg/fg) layout_render_env$panels [[1]] gTree[panel-1.gTree.29710] [[2]] gTree[panel-2.gTree.29724] # panels assembled into a gtable layout_render_env$plot_table TableGrob (4 x 7) "layout": 12 grobs z cells name 
grob 1 1 (3-3,2-2) panel-1-1 gTree[panel-1.gTree.29710] 2 1 (3-3,6-6) panel-2-1 gTree[panel-2.gTree.29724] 3 3 (1-1,2-2) axis-t-1-1 zeroGrob[NULL] 4 3 (1-1,6-6) axis-t-2-1 zeroGrob[NULL] 5 3 (4-4,2-2) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29727] 6 3 (4-4,6-6) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29727] 7 3 (3-3,5-5) axis-l-1-2 zeroGrob[NULL] 8 3 (3-3,1-1) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29733] 9 3 (3-3,7-7) axis-r-1-2 zeroGrob[NULL] 10 3 (3-3,3-3) axis-r-1-1 zeroGrob[NULL] 11 2 (2-2,2-2) strip-t-1-1 gtable[strip] 12 2 (2-2,6-6) strip-t-2-1 gtable[strip] # individual labels drawn before being added to gtable and returned layout_render_env$labels $x $x[[1]] zeroGrob[NULL] $x[[2]] titleGrob[axis.title.x.bottom..titleGrob.29783] $y $y[[1]] titleGrob[axis.title.y.left..titleGrob.29786] $y[[2]] zeroGrob[NULL] 18.3.1.1 Sneak peak: The rest of the gtable step is just updating this plot_table object. all_plot_table_versions <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = "all", vars = "plot_table" ) names(all_plot_table_versions) [1] "Step8" "Step10" "Step22" "Step23" "Step24" "Step25" "Step26" "Step27" [9] "Step28" "Step29" "Step30" "Step31" "Step32" "Step33" "Step34" lapply(seq_along(all_plot_table_versions), function(i) { ggsave(tempfile(sprintf("plot_table_%02d_", i), fileext = ".png"), all_plot_table_versions[[i]]) }) dir(tempdir(), "plot_table_.*png", full.names = TRUE) %>% magick::image_read() %>% magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>% magick::image_write_gif("images/plot_table_animation1.gif", delay = .5) plot_table_animation1 all_plot_table_versions2 <- ggtrace_inspect_vars( x = p + labs( subtitle = "This is a subtitle", caption = "@yjunechoe", tag = "A" ) , method = ggplot2:::ggplot_gtable.ggplot_built, at = "all", vars = "plot_table" ) identical(names(all_plot_table_versions), names(all_plot_table_versions2)) lapply(seq_along(all_plot_table_versions2), 
function(i) { ggsave(tempfile(sprintf("plot_table2_%02d_", i), fileext = ".png"), all_plot_table_versions2[[i]]) }) dir(tempdir(), "plot_table2_.*png", full.names = TRUE) %>% magick::image_read() %>% magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>% magick::image_write_gif("images/plot_table_animation2.gif", delay = .5) plot_table_animation2 18.3.2 Adding guides The legend (legend_box) is first defined in Step 11: body(ggplot2:::ggplot_gtable.ggplot_built)[[11]] title_height <- grobHeight(title) legend_box <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 12, vars = "legend_box" ) grid.newpage() grid.draw(legend_box) It then undergoes some edits/tweaks, including resolving the legend.position theme setting, and then finally gets added to the plot in Step 15: body(ggplot2:::ggplot_gtable.ggplot_built)[[15]] caption_height <- grobHeight(caption) p_with_legend <- ggtrace_inspect_vars( x = p, method = ggplot2:::ggplot_gtable.ggplot_built, at = 16, vars = "plot_table" ) grid.newpage() grid.draw(p_with_legend) The bulk of the work was done in Step 11, with the build_guides() function. That in turn calls guides_train() and guides_gengrob(), which in turn call guide_train() and guide_gengrob() for each scale (including positional aesthetics like x and y). Why the scale? The scale is actually what holds information about the guide. They’re two sides of the same coin - the scale translates the underlying data to some defined space, and the guide reverses that (translates a space to data). One’s for drawing, the other is for reading. This is also why all scale_*() functions take a guide argument. Positional scales use guide_axis() as the default, and non-positional scales use guide_legend() as the default. 
class(guide_legend()) [1] "GuideLegend" "Guide" "ggproto" "gg" # This explicitly spells out the default p + scale_color_discrete(guide = guide_legend()) This is the output of the guide_train() method defined for guide_legend(). The most important piece of it is key, which is the data associated with the legend. # TODO: The unexported function no longer exists, so we had to turn off eval. names( ggtrace_inspect_return(p, ggplot2:::guide_train.legend) ) ggtrace_inspect_return(p, ggplot2:::guide_train.legend)$key The output of guide_train() is passed to guide_gengrob(). This is the output of the guide_gengrob() method defined for guide_legend(): # TODO: The unexported function no longer exists, so we had to turn off eval. legend_gengrob <- ggtrace_inspect_return(p, ggplot2:::guide_gengrob.legend) grid.newpage() grid.draw(legend_gengrob) 18.3.3 Adding adornment Adornment is everything else after the legend step that we saw in the gifs above. It looks trivial, but the step we’re glossing over is ~150 lines of code. It’s not super complicated, though - just a lot of if-else statements and a handful of low-level {grid} and {gtable} functions. 
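To get a feel for the kind of {gtable} calls involved, here is a minimal sketch that adds a title row to a finished gtable by hand. It is not what ggplot_gtable() does internally; the plot, the title text and the grob name "manual-title" are made up for illustration. gtable_add_rows() and gtable_add_grob() are exported {gtable} functions.

```r
library(ggplot2)
library(grid)
library(gtable)

# A small plot without a title, converted all the way to a gtable
p_plain <- ggplot(mpg, aes(displ, hwy)) + geom_point()
tbl <- ggplot_gtable(ggplot_build(p_plain))

# A text grob to use as a hand-made title
title <- textGrob("Added by hand")

# Add one row at the very top (pos = 0), tall enough for the text
tbl2 <- gtable_add_rows(
  tbl,
  heights = grobHeight(title) + unit(1, "lines"),
  pos = 0
)

# Place the grob in that new row, spanning every column
tbl2 <- gtable_add_grob(
  tbl2, title,
  t = 1, l = 1, r = ncol(tbl2),
  clip = "off", name = "manual-title"
)

grid.newpage()
grid.draw(tbl2)
```

Printing tbl2 shows the new entry in its layout, in the same format as the plot_table printout earlier.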
18.3.4 Output To put it all together: p_built <- ggplot_build(p) p_gtable <- ggplot_gtable(p_built) grid.newpage() grid.draw(p_gtable) "],["introducing-ggproto.html", "18.4 Introducing ggproto", " 18.4 Introducing ggproto It’s essentially a list of functions String <- list( add = function(x, y) paste0(x, y), subtract = function(x, y) gsub(y, "", x, fixed = TRUE), show = function(x, y) paste0(x, " and ", y) ) Number <- list( add = function(x, y) x + y, subtract = function(x, y) x - y, show = String$show ) String$add("a", "b") [1] "ab" String$subtract("june", "e") [1] "jun" String$show("ggplot", "bookclub") [1] "ggplot and bookclub" Number$add(1, 2) [1] 3 Number$subtract(10, 5) [1] 5 Number$show(1, 2) [1] "1 and 2" 18.4.1 ggproto syntax From the book: Person <- ggproto("Person", NULL, first = "", last = "", birthdate = NA, full_name = function(self) { paste(self$first, self$last) }, age = function(self) { days_old <- Sys.Date() - self$birthdate floor(as.integer(days_old) / 365.25) }, description = function(self) { paste(self$full_name(), "is", self$age(), "old") } ) 18.4.2 ggproto style guide Kind of dense - can read through on your own but most can be picked up as we read the rest of the book. "],["meeting-videos-18.html", "18.5 Meeting Videos", " 18.5 Meeting Videos 18.5.1 Cohort 1 Meeting chat log 00:58:20 Ryan S: so sorry that I have to drop off in a second. I'll look for the remaining few minutes on the video. 00:58:28 Ryan S: Thanks, June! 
Meeting chat log 00:49:23 June Choe: https://yjunechoe.github.io/ggtrace-user2022 00:54:24 June Choe: https://github.com/EvaMaeRey/mytidytuesday/blob/master/2022-01-03-easy-geom-recipes/easy_geom_recipes.Rmd 00:59:31 June Choe: https://www.rstudio.com/resources/rstudioconf-2020/extending-your-ability-to-extend-ggplot2/ "],["extending-ggplot2.html", "Chapter 19 Extending ggplot2", " Chapter 19 Extending ggplot2 Learning objectives: How to overcome the challenge of a particular plot Learn how to extend ggplot2 in different ways "],["overview.html", "19.1 Overview", " 19.1 Overview In this chapter we see how to extend the graphics of a plot. In particular, we will see how the following layers and other key parts of a plot are composed, and where changes can be applied. Themes Stats Geoms Coords Scales Positions Facets Guides "],["themes-1.html", "19.2 Themes", " 19.2 Themes How about creating new theme elements? The base theme is theme_grey(); here is an example of the modifications made to theme_bw() to obtain theme_minimal(). Themes are the easiest part of a plot to modify. Use the print() function on a theme_<…> function to see its specification; the %+replace% operator shows where the substitutions take place. 
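The same %+replace% pattern can be used for a theme of your own. A minimal sketch, assuming you want to start from theme_minimal(); theme_bookclub() is a made-up name and the two overrides are arbitrary choices for illustration.

```r
library(ggplot2)

# Hypothetical custom theme: start from theme_minimal() and replace two elements
theme_bookclub <- function(base_size = 11, base_family = "") {
  theme_minimal(base_size = base_size, base_family = base_family) %+replace%
    theme(
      panel.grid.minor = element_blank(),  # drop minor grid lines
      legend.position = "bottom",          # move the legend below the panel
      complete = TRUE
    )
}

p_theme <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  theme_bookclub()
```

Because the function has the same shape as the built-in themes, it can be passed to theme_set() like any of them.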
print(theme_grey) print(theme_minimal) and %+replace% operator print(theme_bw) function (base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22) { theme_grey(base_size = base_size, base_family = base_family, base_line_size = base_line_size, base_rect_size = base_rect_size) %+replace% theme(panel.background = element_rect(fill = "white", colour = NA), panel.border = element_rect(fill = NA, colour = "grey20"), panel.grid = element_line(colour = "grey92"), panel.grid.minor = element_line(linewidth = rel(0.5)), strip.background = element_rect(fill = "grey85", colour = "grey20"), complete = TRUE) } <bytecode: 0x55d928ce5090> <environment: namespace:ggplot2> print(theme_minimal) function (base_size = 11, base_family = "", base_line_size = base_size/22, base_rect_size = base_size/22) { theme_bw(base_size = base_size, base_family = base_family, base_line_size = base_line_size, base_rect_size = base_rect_size) %+replace% theme(axis.ticks = element_blank(), legend.background = element_blank(), legend.key = element_blank(), panel.background = element_blank(), panel.border = element_blank(), strip.background = element_blank(), plot.background = element_blank(), complete = TRUE) } <bytecode: 0x55d922987328> <environment: namespace:ggplot2> While if we call the function: print(theme_minimal()), we can see all the options set available. In general, if you want to make a modification to an existing theme, the general approach is to simply use the theme() function while setting complete = TRUE. "],["stats-1.html", "19.3 Stats", " 19.3 Stats Extending stats is one of the most useful ways to extend the capabilities of ggplot2 Stats are purely about data transformations Creating new stats stat with these extension functions: compute_*() setup_*() The logic of a stat is made of subsequent calls: In general the transformation is done to single group starting at the compute_group() level. 
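As a concrete illustration of compute_group(), here is a sketch adapted from the convex hull example in the official "Extending ggplot2" vignette. StatChull and stat_chull() are not part of ggplot2; they are defined here, and the stat simply keeps the rows of each group that lie on its convex hull.

```r
library(ggplot2)

# A new Stat: for each group, keep only the points on the convex hull
StatChull <- ggproto("StatChull", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales) {
    data[chull(data$x, data$y), , drop = FALSE]
  }
)

# The usual constructor wrapper around layer()
stat_chull <- function(mapping = NULL, data = NULL, geom = "polygon",
                       position = "identity", na.rm = FALSE,
                       show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    stat = StatChull, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

p_hull <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  stat_chull(fill = NA, colour = "black")
```

With no grouping aesthetic there is a single group, so compute_group() runs once and the second layer's data contains only the hull points.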
Before the compute_*() calls come the setup_*() functions, which allow the Stat to react and modify itself in response to the given parameters. Sometimes, with related stats, all that is necessary is to make a subclass and provide new setup_params()/setup_data() methods. print(stat_bin()) geom_bar: na.rm = FALSE, orientation = NA stat_bin: binwidth = NULL, bins = NULL, center = NULL, boundary = NULL, breaks = NULL, closed = c("right", "left"), pad = FALSE, na.rm = FALSE, orientation = NA position_stack "],["geoms-2.html", "19.4 Geoms", " 19.4 Geoms Why make a new geom? The data cannot be shown meaningfully by any current geom. The output of multiple geoms needs to be combined. The geom needs grobs not currently available from existing geoms. The logic of a geom is made of subsequent calls: Implementation is easier for draw_group() setup_params()+setup_data() overwriting the setup_data() Example Reparameterisation of geom_segment() with geom_spoke() print(GeomSpoke$setup_data) <ggproto method> <Wrapper function> function (...) setup_data(...) <Inner function (f)> function (data, params) { data$radius <- data$radius %||% params$radius data$angle <- data$angle %||% params$angle transform(data, xend = x + cos(angle) * radius, yend = y + sin(angle) * radius) } Example geom_smooth() as a combination of geom_line() and geom_ribbon() preparing the data for each of the geoms inside the draw_*() print(GeomSmooth$draw_group) <ggproto method> <Wrapper function> function (...) draw_group(...) 
<Inner function (f)> function (data, panel_params, coord, lineend = "butt", linejoin = "round", linemitre = 10, se = FALSE, flipped_aes = FALSE) { ribbon <- transform(data, colour = NA) path <- transform(data, alpha = NA) ymin = flipped_names(flipped_aes)$ymin ymax = flipped_names(flipped_aes)$ymax has_ribbon <- se && !is.null(data[[ymax]]) && !is.null(data[[ymin]]) gList(if (has_ribbon) GeomRibbon$draw_group(ribbon, panel_params, coord, flipped_aes = flipped_aes), GeomLine$draw_panel(path, panel_params, coord, lineend = lineend, linejoin = linejoin, linemitre = linemitre)) } "],["coords.html", "19.5 Coords", " 19.5 Coords Example: CoordCartesian rescaling the position data Coords takes care of rendering the axes, axis labels, and panel foreground and background and it can intercept both the layer data and facet layout and modify it, with: draw_*() transform() Example print(CoordCartesian$transform) <ggproto method> <Wrapper function> function (...) transform(...) <Inner function (f)> function (data, panel_params) { data <- transform_position(data, panel_params$x$rescale, panel_params$y$rescale) transform_position(data, squish_infinite, squish_infinite) } print(coord_sf) "],["scales-1.html", "19.6 Scales", " 19.6 Scales Example Build a wrapper for a new palette to an existing scale. This is done by providing a new palette scale into the relevant basic scale. print(scale_fill_viridis_c) function (name = waiver(), ..., alpha = 1, begin = 0, end = 1, direction = 1, option = "D", values = NULL, space = "Lab", na.value = "grey50", guide = "colourbar", aesthetics = "fill") { continuous_scale(aesthetics, name = name, palette = pal_gradient_n(pal_viridis(alpha, begin, end, direction, option)(6), values, space), na.value = na.value, guide = guide, ...) 
} <bytecode: 0x55d92cb2eea8> <environment: namespace:ggplot2> "],["other-important-parts.html", "19.7 Other important parts", " 19.7 Other important parts 19.7.1 Positions The Position class is slightly simpler than the other ggproto classes. 19.7.2 Facets Look at FacetWrap or FacetGrid, and simply provide new compute_layout(), and map_data() methods 19.7.3 Guides What is a ggproto? The answer is back in chapter20 ggplot2 internals "],["references-1.html", "19.8 References", " 19.8 References Extending ggplot2 A List of ggplot2 extensions ggplot Extension Course "],["meeting-videos-19.html", "19.9 Meeting Videos", " 19.9 Meeting Videos 19.9.1 Cohort 1 Meeting chat log 00:10:52 June Choe: I'm fine with anything! 00:39:47 Federica Gazzelloni: - [Extending ggplot2](https://ggplot2.tidyverse.org/articles/extending-ggplot2.html) - [A List of ggplot2 extensions](https://exts.ggplot2.tidyverse.org/) - [ggplot Extension Course](https://mq-software-carpentry.github.io/r-ggplot-extension/aio.html) - [Example](https://github.com/EvaMaeRey/mytidytuesday/blob/master/2022-01-03-easy-geom-recipes/easy_geom_recipes.Rmd) - [extending-your-ability-to-extend-ggplot2](https://www.rstudio.com/resources/rstudioconf-2020/extending-your-ability-to-extend-ggplot2/) - [ggtrace](https://yjunechoe.github.io/ggtrace-user2022/#/title-slide) "],["a-case-study.html", "Chapter 20 A case study", " Chapter 20 A case study Learning objectives: {These are nice to have, but take some extra work. 
It’s ok to skip these if necessary.} "],["slide-1-title.html", "20.1 {Slide 1 title}", " 20.1 {Slide 1 title} {Create slides as sections marked with ##, but keep them short like a slide.} "],["meeting-videos-20.html", "20.2 Meeting Videos", " 20.2 Meeting Videos 20.2.1 Cohort 1 Meeting chat log "],["mastering-the-grammar.html", "Chapter 21 Mastering the Grammar", " Chapter 21 Mastering the Grammar This was previously covered as chapter 13, but it does not exist as a separate chapter in the current version of the book. Learning Objectives Review the elements and benefits of the grammar of graphics Be able to break down simple graphics into their component parts Mapping Coordinates; define and identify layer and scaling as well as coordinate and faceting Create a process for integrating the grammar into your visual design Apply the grammar to the analysis of existing graphics. References Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, Volume 19, Number 1, 3–28. "],["introduction-6.html", "21.1 Introduction", " 21.1 Introduction Definition of a grammar: “the fundamental principles or rules of an art or science” (OED Online 1989). “In order to unlock the full power of ggplot2, you’ll need to master the underlying grammar. By understanding the grammar, and how its components fit together, you can create a wider range of visualizations, combine multiple sources of data, and customise to your heart’s content.” “The next chapters discuss the components in more detail, and provide more examples of how you can use them in practice.” Grammar versus chart heuristics. Often we match data type to a standard chart type (for example: bar chart for categorical comparisons). 4 parts of a Layer Data and aesthetic mapping. “Along with the data, we need a specification of which variables are mapped to which aesthetics.” (Wickham, 2010, p. 10) Stat. 
“A statistical transformation, or stat, transforms the data, typically by summarizing them in some manner.” (Wickham, 2010, p. 10) Geom. “Geometric objects, or geoms for short, control the type of plot that you create. For example, using a point geom will create a scatterplot, whereas using a line geom will create a line plot. We can classify geoms by their dimensionality: • 0d: point, text, • 1d: path, line (ordered path), • 2d: polygon, interval.” (Wickham, 2010, p. 11) Position adjustment Examples include geom_jitter or how bar plots adjust so the lines do not overlap. Review of key terms Geom: point, bar, boxplot, line Aesthetics: size, color, shape, position Aesthetics finder Benefits of using the Grammar Allows one to iterate in the creation and/or updating of a plot. Gives a language for viewing, and learning from, existing data viz. Enables a better process by focusing the viz developer on the intended purpose of the visual/analysis (not just matching a chart to data). Expands data viz beyond just how to use this particular software syntax. "],["building-a-scatterplot.html", "21.2 Building a scatterplot", " 21.2 Building a scatterplot ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_line() + theme(legend.position = "none") ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_bar(stat = "identity", position = "identity", fill = NA) + theme(legend.position = "none") ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_point() + geom_smooth(method = "lm") + labs(title = "What type of graph would you call this?", subtitle = "Notice the defaults of ggplot2") + theme(plot.title = element_text(size = 15, color = "firebrick", face = "bold", hjust = .5)) + theme(plot.subtitle = element_text(hjust = .5)) "],["scaling.html", "21.3 Scaling", " 21.3 Scaling “The values in the previous table have no meaning to the computer. 
We need to convert them from data units (e.g., litres, miles per gallon and number of cylinders) to graphical units (e.g., pixels and colours) that the computer can display. This conversion process is called scaling and performed by scales.” what we see to what the computer reads we see colours; computer reads hexadecimal string we see size; computer reads a number we see shapes; the computer reads an integer Example in Page 4-6 of Wickham, H. (2010) “Scales typically map from a single variable to a single aesthetic, but there are exceptions. For example, we can map one variable to hue and another to saturation, to create a single aesthetic, color. We can also create redundant mappings, mapping the same variable to multiple aesthetics.” (Wickham, 2010, p. 13) These aesthetic specifications that are meaningful to R are described in vignette(“ggplot2-specs”) Shape Shapes take five types of values: An integer in [0,25]: shapes <- data.frame( shape = c(0:19, 22, 21, 24, 23, 20), x = 0:24 %/% 5, y = -(0:24 %% 5) ) ggplot(shapes, aes(x, y)) + geom_point(aes(shape = shape), size = 5, fill = "red") + geom_text(aes(label = shape), hjust = 0, nudge_x = 0.15) + scale_shape_identity() + expand_limits(x = 4.1) + theme_void() Line type Line types can be specified with: An integer or name: 0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash, as shown below: lty <- c("solid", "dashed", "dotted", "dotdash", "longdash", "twodash") linetypes <- data.frame( y = seq_along(lty), lty = lty ) ggplot(linetypes, aes(0, y)) + geom_segment(aes(xend = 5, yend = y, linetype = lty)) + scale_linetype_identity() + geom_text(aes(label = lty), hjust = 0, nudge_y = 0.2) + scale_x_continuous(NULL, breaks = NULL) + scale_y_reverse(NULL, breaks = NULL) Font face There are only three fonts that are guaranteed to work everywhere: “sans” (the default), “serif”, or “mono”: df <- data.frame(x = 1, y = 3:1, family = c("sans", "serif", "mono")) ggplot(df, aes(x, y)) + 
geom_text(aes(label = family, family = family)) Colour and fill Note that shapes 21-24 have both stroke colour and a fill. The size of the filled part is controlled by size, the size of the stroke is controlled by stroke. Each is measured in mm, and the total size of the point is the sum of the two. Note that the size is constant along the diagonal in the following figure. sizes <- expand.grid(size = (0:3) * 2, stroke = (0:3) * 2) ggplot(sizes, aes(size, stroke, size = size, stroke = stroke)) + geom_abline(slope = -1, intercept = 6, colour = "white", size = 6) + geom_point(shape = 21, fill = "red") + scale_size_identity() Horizontal and vertical justification have the same parameterisation, either a string (“top”, “middle”, “bottom”, “left”, “center”, “right”) or a number between 0 and 1: top = 1, middle = 0.5, bottom = 0 left = 0, center = 0.5, right = 1 just <- expand.grid(hjust = c(0, 0.5, 1), vjust = c(0, 0.5, 1)) just$label <- paste0(just$hjust, ", ", just$vjust) ggplot(just, aes(hjust, vjust)) + geom_point(colour = "grey70", size = 5) + geom_text(aes(label = label, hjust = hjust, vjust = vjust)) "],["adding-complexity-faceting-coordinates-hierarchy-of-defaults.html", "21.4 Adding complexity; faceting, coordinates, hierarchy of defaults", " 21.4 Adding complexity; faceting, coordinates, hierarchy of defaults facets, multiple layers and statistics ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth() + facet_wrap(~year) “A coordinate system, coord for short, maps the position of objects onto the plane of the plot. Position is often specified by two coordinates (x, y), but could be any number of coordinates. The Cartesian coordinate system is the most common coordinate system for two dimensions, whereas polar coordinates and various map projections are used less frequently.” (Wickham, 2010, p. 13) “Coordinate systems affect all position variables simultaneously and differ from scales in that they also change the appearance of the geometric objects. 
For example, in polar coordinates, bar geoms look like segments of a circle. Additionally, scaling is performed before statistical transformation, whereas coordinate transformations occur afterward.” (Wickham, 2010, p. 13) Coord_polar ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_bar(stat = "identity", position = "identity", fill = NA) + theme(legend.position = "none") + coord_polar() “The angle component is particularly useful for cyclical data because the starting and ending points of a single cycle are adjacent. Common cyclical variables are components of dates, like days of the year or hours of the day, and angles, like wind direction.” (Wickham, 2010, p. 22) “In the grammar, a pie chart is a stacked bar geom drawn in a polar coordinate system.” (Wickham, 2010, p. 22) ggplot(diamonds,aes(x = "", fill=clarity)) + geom_bar(width = 1) + coord_polar (theta="y") Figure 15 shows this, as well as a bullseye plot, which arises when we map the height to radius instead of angle. (Wickham, 2010, p. 22) ggplot(diamonds,aes(x = "", fill=clarity)) + geom_bar(width = 1) + coord_polar (theta="x") The Coxcomb plot is a bar chart in polar coordinates. Note that the categories abut in the Coxcomb, but are separated in the bar chart: this is an example of a graphical convention that differs in different coordinate systems. (Wickham, 2010, p. 
23) library(patchwork) a <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width = 1) + theme(legend.position = "none") b <- ggplot(diamonds,aes(x = clarity, fill=clarity)) + geom_bar(width = 1) + coord_polar (theta="y") + theme(legend.position = "none") a + b Defaults The full ggplot2 specification of the scatterplot of price versus weight is: ggplot() + layer( data = diamonds, mapping = aes(x = carat, y = price), geom = "point", stat = "identity", position = "identity" ) + scale_y_continuous() + scale_x_continuous() + coord_cartesian() "],["process-and-examples.html", "21.5 Process and Examples", " 21.5 Process and Examples Process Start with business or research question and purpose Write out grammar Think through chart types, geom options Iterate In the Jan 3, 2022 video, Statistical Rethinking 2022 Lecture 01 Richard McElreath describes a research process (see 19 minute mark): Theoretical Estimand The Scientific (causal) model(s) Use 1 & 2 to build statistical model(s) Simulate from 2 to validate 3 yields 1 Analyze real data Does this translate to a data viz process? Apply the grammar to data viz examples The chapter gives 7 examples including “Napoleon’s march” by Charles Joseph Minard which is also covered in the A Layered Grammar of Graphics article. We will look at examples from here: Our 51 Best (And Weirdest) Charts Of 2021 by FiveThirtyEight Staff (Published Dec. 20, 2021) Resources Wickham, H. (2010). A layered grammar of graphics. Journal of Computational and Graphical Statistics, Volume 19, Number 1, 3–28. Chapter 2 of Fundamentals of Data Visualization by Claus O. Wilke gives an overview of Mapping data onto aesthetics and chapter 3 is on Coordinate systems and axes. "],["meeting-videos-21.html", "21.6 Meeting Videos", " 21.6 Meeting Videos 21.6.1 Cohort 1 "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). 
You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]]
diff --git a/shape.html b/shape.html
index fd640da0..a5dfd0d9 100644
--- a/shape.html
+++ b/shape.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/simple-features-maps.html b/simple-features-maps.html
index 66925834..314570ea 100644
--- a/simple-features-maps.html
+++ b/simple-features-maps.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/size.html b/size.html
index 8c82aded..b7048acd 100644
--- a/size.html
+++ b/size.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/slide-1-title.html b/slide-1-title.html
index a94fed54..ae28ed4b 100644
--- a/slide-1-title.html
+++ b/slide-1-title.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/statistical-summaries-1.html b/statistical-summaries-1.html
index 7511f51a..6c206645 100644
--- a/statistical-summaries-1.html
+++ b/statistical-summaries-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,22 +603,22 @@
4.6 Statistical Summaries
geom_histogram()
and geom_bin2d()
use a familiar geom, geom_bar()
and geom_raster()
, combined with a new statistical transformation, stat_bin()
and stat_bin2d()
. stat_bin()
and stat_bin2d()
combine the data into bins and count the number of observations in each bin. But what if we want a summary other than count? So far, we’ve just used the default statistical transformation associated with each geom. Now we’re going to explore how to use stat_summary_bin()
and stat_summary_2d()
to compute different summaries.
-
+
-
+
-ggplot(diamonds, aes(table, depth)) +
- geom_bin2d(binwidth = 1, na.rm = TRUE) +
- xlim(50, 70) +
- ylim(50, 70)
+ggplot(diamonds, aes(table, depth)) +
+ geom_bin2d(binwidth = 1, na.rm = TRUE) +
+ xlim(50, 70) +
+ ylim(50, 70)
-ggplot(diamonds, aes(table, depth, z = price)) +
- geom_raster(binwidth = 1, stat = "summary_2d", fun = mean,
- na.rm = TRUE) +
- xlim(50, 70) +
- ylim(50, 70)
+ggplot(diamonds, aes(table, depth, z = price)) +
+ geom_raster(binwidth = 1, stat = "summary_2d", fun = mean,
+ na.rm = TRUE) +
+ xlim(50, 70) +
+ ylim(50, 70)
## Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
## ℹ Consider using `geom_tile()` instead.
## Raster pixels are placed at uneven horizontal intervals and will be shifted
@@ -624,22 +630,22 @@ 4.6 Statistical Summaries Statistical geoms introduce a layer of statistical summaries in between the raw data and the result
Although ggplot2 does not have direct 3d support, it does provide the ability to plot 2d images representing 3d data. These include: contours, colored tiles, and bubble plots.
-
+
## Warning: The dot-dot notation (`..level..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(level)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
-
+
-
+
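A minimal sketch of the "summary other than count" idea from section 4.6, assuming ggplot2 is installed and using its bundled `diamonds` data. It calls `stat_summary_2d()` directly instead of going through `geom_raster(stat = "summary_2d")`, and reads the computed `value` column back with `layer_data()`. The names `p_summary` and `summary_data` are illustrative only.

```r
library(ggplot2)

# Mean price within each 1 x 1 bin of (table, depth).
# stat_summary_2d() bins x and y, then applies `fun` to the z aesthetic per bin.
p_summary <- ggplot(diamonds, aes(table, depth, z = price)) +
  stat_summary_2d(fun = mean, binwidth = 1, na.rm = TRUE)

# The per-bin summary ends up in the `value` column of the built layer data
summary_data <- layer_data(p_summary, 1)
head(summary_data[, c("x", "y", "value")])
```

Swapping `fun = mean` for `fun = median` (or any function returning a single number) changes the summary without touching the binning.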
diff --git a/statistical-summaries.html b/statistical-summaries.html
index ee208121..cce53faf 100644
--- a/statistical-summaries.html
+++ b/statistical-summaries.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/stats-1.html b/stats-1.html
index 6a2f6538..a69c3da9 100644
--- a/stats-1.html
+++ b/stats-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -619,7 +625,7 @@
19.3 Stats print(stat_bin())
+
geom_bar: na.rm = FALSE, orientation = NA
stat_bin: binwidth = NULL, bins = NULL, center = NULL, boundary = NULL, breaks = NULL, closed = c("right", "left"), pad = FALSE, na.rm = FALSE, orientation = NA
position_stack
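The stat parameters printed for `stat_bin()` are the kind of values a Stat's `setup_params()` method receives before any `compute_*()` call. A hedged sketch of that hook, following the subclassing pattern from the "Extending ggplot2" vignette: `StatGroupCentre`, `stat_group_centre()` and the `trim` parameter are hypothetical names made up for this example, not part of ggplot2.

```r
library(ggplot2)

# Hypothetical Stat: reduces each group to a single (trimmed) mean point
StatGroupCentre <- ggproto("StatGroupCentre", Stat,
  required_aes = c("x", "y"),
  # setup_params() runs once per layer, before compute_group();
  # here it only fills in a default when the user left `trim` as NULL
  setup_params = function(data, params) {
    if (is.null(params$trim)) {
      params$trim <- 0
    }
    params
  },
  compute_group = function(data, scales, trim = 0) {
    data.frame(
      x = mean(data$x, trim = trim),
      y = mean(data$y, trim = trim)
    )
  }
)

# Hypothetical constructor wrapping layer(), as the built-in stat_*() functions do
stat_group_centre <- function(mapping = NULL, data = NULL, geom = "point",
                              position = "identity", na.rm = FALSE,
                              show.legend = NA, inherit.aes = TRUE,
                              trim = NULL, ...) {
  layer(
    stat = StatGroupCentre, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(trim = trim, na.rm = na.rm, ...)
  )
}

p_centre <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point(alpha = 0.3) +
  stat_group_centre(size = 4)

# One row per drv group in the second layer
centres <- layer_data(p_centre, 2)
```

Keeping the default logic in `setup_params()` rather than in `compute_group()` means it is evaluated once for the whole layer, not once per group.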
diff --git a/stats.html b/stats.html
index 5f341f03..2ed5f674 100644
--- a/stats.html
+++ b/stats.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/text-labels.html b/text-labels.html
index dd6adede..ab58495a 100644
--- a/text-labels.html
+++ b/text-labels.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -606,46 +612,46 @@
7.3 Text labels geom_text(aes(label = text), vjust = "inward", hjust = "inward")
-df <- data.frame(x = 1, y = 3:1, face = c("plain", "bold", "italic"))
-ggplot(df, aes(x, y)) +
- geom_text(aes(label = face, fontface = face, ),
- vjust = "inward", hjust = "inward", size = 20, angle = 10)
-
-
-
-
-
-
-
-
-
-
-
-library(ggrepel)
-ggplot(mpg, aes(displ, hwy)) +
- geom_text_repel(aes(label = model)) +
- xlim(1, 8)
-
-label <- data.frame(
- waiting = c(55, 80),
- eruptions = c(2, 4.3),
- label = c("peak one", "peak two")
-)
-
-ggplot(faithfuld, aes(waiting, eruptions)) +
- geom_tile(aes(fill = density)) +
- geom_label(data = label, aes(label = label))
-
+df <- data.frame(x = 1, y = 3:1, face = c("plain", "bold", "italic"))
+ggplot(df, aes(x, y)) +
+ geom_text(aes(label = face, fontface = face, ),
+ vjust = "inward", hjust = "inward", size = 20, angle = 10)
+
+
+
+
+
+
+
+
+
+
+
+library(ggrepel)
+ggplot(mpg, aes(displ, hwy)) +
+ geom_text_repel(aes(label = model)) +
+ xlim(1, 8)
+
+label <- data.frame(
+ waiting = c(55, 80),
+ eruptions = c(2, 4.3),
+ label = c("peak one", "peak two")
+)
+
+ggplot(faithfuld, aes(waiting, eruptions)) +
+ geom_tile(aes(fill = density)) +
+ geom_label(data = label, aes(label = label))
+
geom_label
-
+
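A small sketch of what `vjust = "inward"` and `hjust = "inward"` do, assuming ggplot2 is installed. The data frame `corners` is made up for illustration: labels sitting on the edge of the plotting area are justified towards the centre so they are not clipped.

```r
library(ggplot2)

# Four corner labels plus one in the middle (illustrative data)
corners <- data.frame(
  x = c(1, 1, 2, 2, 1.5),
  y = c(1, 2, 1, 2, 1.5),
  text = c("bottom-left", "top-left", "bottom-right", "top-right", "center")
)

# "inward" aligns each label towards the centre of the panel
p_inward <- ggplot(corners, aes(x, y)) +
  geom_text(aes(label = text), vjust = "inward", hjust = "inward")

p_inward
```

Replacing `"inward"` with `"outward"` pushes the labels away from the centre instead.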
diff --git a/scatterplot.html b/the-basics.html
similarity index 93%
rename from scatterplot.html
rename to the-basics.html
index 32a9250a..acfe75af 100644
--- a/scatterplot.html
+++ b/the-basics.html
@@ -4,18 +4,18 @@
- 2.1 Scatterplot: | ggplot2 Book Club
+ 2.1 The basics | ggplot2 Book Club
-
+
-
+
@@ -23,7 +23,7 @@
-
+
@@ -31,7 +31,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -594,11 +600,16 @@
-
-2.1 Scatterplot:
-
-
+
+2.1 The basics
+
+- Each geom can be useful by itself.
+- Geoms can be used in ways to construct more complex geoms.
+- The geoms discussed in this chapter are two dimensional (e.g.,
x
and y
).
+- All geoms understand
color
or colour
and size aesthetics.
+- Bar, tile, and polygon understand
fill
.
+- The terms above are all parameters within ggplot2 functions.
+
@@ -606,7 +617,7 @@ 2.1 Scatterplot:
-
+
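A hedged illustration of the points in "2.1 The basics", assuming ggplot2 is installed. The three-row data frame is made up; `GeomBar$aesthetics()` and `GeomLine$aesthetics()` are used only to inspect which aesthetics each geom understands.

```r
library(ggplot2)

df <- data.frame(x = c(3, 1, 5), y = c(2, 4, 6))
base <- ggplot(df, aes(x, y))

# colour and size are set as constants here, outside aes()
points <- base + geom_point(colour = "steelblue", size = 4)

# fill applies to geoms that draw an area, such as bars, tiles and polygons
bars <- base + geom_bar(stat = "identity", fill = "steelblue", colour = "black")

# Each Geom object can report the aesthetics it understands
bar_has_fill  <- "fill" %in% GeomBar$aesthetics()
line_has_fill <- "fill" %in% GeomLine$aesthetics()
```

Printing `points` or `bars` draws the plots; the two logical values show that a bar accepts `fill` while a line does not.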
diff --git a/the-build-step.html b/the-build-step.html
index cf2262d2..cfff9758 100644
--- a/the-build-step.html
+++ b/the-build-step.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,7 +603,7 @@
18.2 The build step
This is the function body of ggplot_build()
:
-
+
function (plot)
{
plot <- plot_clone(plot)
@@ -657,10 +663,10 @@ 18.2 The build step as.list(body(ggplot2:::ggplot_build.ggplot))
+
[[1]]
`{`
@@ -798,31 +804,31 @@ 18.2.1 Data preparation
-data_demo_p <- ggplot(mtcars, aes(disp, cyl)) +
- # 1) Inherited data
- geom_point(color = "blue") +
- # 2) Data supplied directly
- geom_point(
- color = "red", alpha = .2,
- data = mpg %>%
- mutate(disp = displ * 100)
- ) +
- # 3) Function to be applied to inherited data
- geom_label(
- aes(label = paste("cyl:", cyl)),
- data = . %>%
- group_by(cyl) %>%
- summarize(disp = mean(disp))
- )
-
-data_demo_p
+data_demo_p <- ggplot(mtcars, aes(disp, cyl)) +
+ # 1) Inherited data
+ geom_point(color = "blue") +
+ # 2) Data supplied directly
+ geom_point(
+ color = "red", alpha = .2,
+ data = mpg %>%
+ mutate(disp = displ * 100)
+ ) +
+ # 3) Function to be applied to inherited data
+ geom_label(
+ aes(label = paste("cyl:", cyl)),
+ data = . %>%
+ group_by(cyl) %>%
+ summarize(disp = mean(disp))
+ )
+
+data_demo_p
Inside the layers
element of the ggplot are Layer
objects which hold information about each layer:
-
+
And the calculated data from each layer can be accessed with the layer_data()
method of the Layer
object:
-
+
<ggproto method>
<Wrapper function>
function (...)
@@ -847,7 +853,7 @@ 18.2.1 Data preparation data_demo_p$layers[[1]]$layer_data(data_demo_p$data)
+
mpg cyl disp hp drat wt qsec vs am gear carb
1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
@@ -881,7 +887,7 @@ 18.2.1 Data preparation data_demo_p$layers[[2]]$layer_data(data_demo_p$data)
+
# A tibble: 234 × 12
manufacturer model displ year cyl trans drv cty hwy fl class
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
@@ -897,7 +903,7 @@ 18.2.1 Data preparation data_demo_p$layers[[3]]$layer_data(data_demo_p$data)
+
# A tibble: 3 × 2
cyl disp
<dbl> <dbl>
@@ -905,18 +911,18 @@ 18.2.1 Data preparation body(ggplot2:::ggplot_build.ggplot)[[5]]
+
data <- rep(list(NULL), length(layers))
-
+
data <- by_layer(function(l, d) l$setup_layer(d, plot), layers,
data, "setting up layer")
For data_demo_p
, the data
variable after step 8 looks like this:
-ggtrace_inspect_vars(
- x = data_demo_p,
- method = ggplot2:::ggplot_build.ggplot,
- at = 9,
- vars = "data"
-)
+ggtrace_inspect_vars(
+ x = data_demo_p,
+ method = ggplot2:::ggplot_build.ggplot,
+ at = 9,
+ vars = "data"
+)
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
@@ -977,122 +983,122 @@ 18.2.1 Data preparation
-ggtrace_inspect_vars(
- x = p,
- method = ggplot2:::ggplot_build.ggplot,
- at = 9,
- vars = "data"
-) %>%
- map(head)
+ggtrace_inspect_vars(
+ x = p,
+ method = ggplot2:::ggplot_build.ggplot,
+ at = 9,
+ vars = "data"
+) %>%
+ map(head)
18.2.2 Data transformation
18.2.2.1 PANEL variable and aesthetic mappings
Continuing with the book example, the data is augmented with the PANEL
variable at Step 11:
-body(ggplot2:::ggplot_build.ggplot)[[11]]
-
-ggtrace_inspect_vars(
- x = p,
- method = ggplot2:::ggplot_build.ggplot,
- at = 12,
- vars = "data"
-) %>%
- map(head)
+body(ggplot2:::ggplot_build.ggplot)[[11]]
+
+ggtrace_inspect_vars(
+ x = p,
+ method = ggplot2:::ggplot_build.ggplot,
+ at = 12,
+ vars = "data"
+) %>%
+ map(head)
And then the group
variable appears at Step 12, which is also when aesthetics get “mapped” (= just mutate()
, essentially):
-
+
18.2.2.2 Scales
Then, scales are applied in Step 13. This leaves the data unchanged for the original plot:
-body(ggplot2:::ggplot_build.ggplot)[[13]]
-
-ggtrace_inspect_vars(
- x = p,
- method = ggplot2:::ggplot_build.ggplot,
- at = 14,
- vars = "data"
-) %>%
- map(head)
+body(ggplot2:::ggplot_build.ggplot)[[13]]
+
+ggtrace_inspect_vars(
+ x = p,
+ method = ggplot2:::ggplot_build.ggplot,
+ at = 14,
+ vars = "data"
+) %>%
+ map(head)
But the effect can be seen with something like scale_x_log10()
:
-ggtrace_inspect_vars(
- x = p + scale_x_log10(),
- method = ggplot2:::ggplot_build.ggplot,
- at = 14,
- vars = "data"
-) %>%
- map(head, 3)
+ggtrace_inspect_vars(
+ x = p + scale_x_log10(),
+ method = ggplot2:::ggplot_build.ggplot,
+ at = 14,
+ vars = "data"
+) %>%
+ map(head, 3)
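The same effect can be checked without {ggtrace}. The following is a minimal sketch, assuming only the exported `layer_data()` helper; the plot `q` is a hypothetical, simplified stand-in for `p` (one layer, no jitter) so that the transformation is easy to compare against the raw values.

```r
library(ggplot2)

# Hypothetical single-layer plot, simpler than `p` (no jitter, no facets)
q <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# x as stored in the layer data, without and with a log10 x scale
x_raw <- layer_data(q)$x
x_log <- layer_data(q + scale_x_log10())$x

# With the log scale, the layer data holds x on the transformed scale
all.equal(as.numeric(x_log), log10(as.numeric(x_raw)))
```

The comparison suggests that the scale transformation is applied to the data itself, before any stat sees it.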
Out-of-bounds handling happens down the line, at Step 17:
-
+
18.2.2.3 Stat
Stat transformation happens right after, at Step 18 (this is why understanding out-of-bounds handling and scale transformation is important!)
-body(ggplot2:::ggplot_build.ggplot)[[18]]
-
-ggtrace_inspect_vars(
- x = p,
- method = ggplot2:::ggplot_build.ggplot,
- at = 19,
- vars = "data"
-) %>%
- map(head, 3)
+body(ggplot2:::ggplot_build.ggplot)[[18]]
+
+ggtrace_inspect_vars(
+ x = p,
+ method = ggplot2:::ggplot_build.ggplot,
+ at = 19,
+ vars = "data"
+) %>%
+ map(head, 3)
Note how, from this point on, the data for the two layers look different. This is because geom_point()
and geom_smooth()
have different Stats.
-
+
[1] "StatIdentity" "Stat" "ggproto" "gg"
-
+
[1] "StatSmooth" "Stat" "ggproto" "gg"
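The code that produced the two class listings is not visible here. One way to get output of this shape, offered as a sketch rather than as the original chunk, is to look at the `stat` element of the layer object that each geom constructor returns:

```r
library(ggplot2)

# Each geom_*() call returns a layer object; its `stat` element
# is the Stat ggproto object used for the statistical transformation.
point_stat  <- geom_point()$stat
smooth_stat <- geom_smooth()$stat

class(point_stat)
class(smooth_stat)
```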
18.2.2.4 Position
At Step 22, positions are adjusted (jittering, dodging, stacking, etc.). We gave geom_point()
a jitter so we see that reflected for the first layer:
-
+
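As a rough illustration of the position step, the sketch below compares the x values of a jittered and a non-jittered point layer using `layer_data()`. The two plots are hypothetical simplifications of `p` (no smooth layer, no facets); the `seed` argument is the one used in the book example.

```r
library(ggplot2)

# Same data and mapping, with and without a position adjustment
plain <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()
jittered <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(position = position_jitter(seed = 2022))

x_raw <- layer_data(plain)$x
x_jit <- layer_data(jittered)$x

# The jittered positions differ slightly from the raw ones
head(data.frame(x_raw, x_jit))

# Because a seed was supplied, rebuilding gives the same jitter
identical(layer_data(jittered)$x, x_jit)
```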
18.2.3 Output
The final state of the data after ggplot_build()
is stored in the data
element of the output of ggplot_build()
:
-
+
ggplot_build()
also returns the trained layout of the plot (scales, panels, etc.) in the layout
element, as well as the original ggplot object in the plot
element:
-
+
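A minimal sketch of inspecting the build output directly, assuming a ggplot2 version in which the result of `ggplot_build()` can be accessed with `$` (the element names `data`, `layout` and `plot` are the ones listed in this chapter). `layer_data()` is an exported shortcut for the per-layer data.

```r
library(ggplot2)

# The expository plot used throughout this chapter
p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(position = position_jitter(seed = 2022)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(vars(year)) +
  ggtitle("A plot for expository purposes")

built <- ggplot_build(p)

# One data frame per layer, in the order the layers were added
length(built$data)
head(built$data[[1]], 3)

# layer_data() builds the plot and returns the data for layer i
head(layer_data(p, i = 2), 3)

# The trained layout is a ggproto object
class(built$layout)
```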
18.2.4 Explore
diff --git a/the-gtable-step.html b/the-gtable-step.html
index 48e9e633..53846b5a 100644
--- a/the-gtable-step.html
+++ b/the-gtable-step.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,23 +603,23 @@
18.3 The gtable step
Again, still working with our plot p
-p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
- geom_point(position = position_jitter(seed = 2022)) +
- geom_smooth(method = "lm", formula = y ~ x) +
- facet_wrap(vars(year)) +
- ggtitle("A plot for expository purposes")
-
+p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
+ geom_point(position = position_jitter(seed = 2022)) +
+ geom_smooth(method = "lm", formula = y ~ x) +
+ facet_wrap(vars(year)) +
+ ggtitle("A plot for expository purposes")
+
-
+
The return value of ggplot_build()
contains the computed data associated with each layer and a Layout
ggproto object which holds information about data other than the layers, including the scales, coordinate system, facets, etc.
-
+
[1] "data" "layout" "plot"
-
-
+
+
[1] "Layout" "ggproto" "gg"
The output of ggplot_build()
is then passed to ggplot_gtable()
to be converted into graphical elements before being drawn:
-
+
function (x, newpage = is.null(vp), vp = NULL, ...)
{
set_last_plot(x)
@@ -638,97 +644,97 @@ 18.3 The gtable step
18.3.1 Rendering the panels
First, each layer is converted into a list of graphical objects (grobs
) …
-
+
geom_grobs <- by_layer(function(l, d) l$draw_geom(d, layout),
plot$layers, data, "converting geom to grob")
This step loops through each layer, taking the layer object l
and the data associated with that layer d
and using the Geom from the layer to draw the data.
-geom_grobs <- ggtrace_inspect_vars(
- x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
- at = 7, vars = "geom_grobs"
-)
-geom_grobs
+geom_grobs <- ggtrace_inspect_vars(
+ x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
+ at = 7, vars = "geom_grobs"
+)
+geom_grobs
[[1]]
[[1]]$`1`
-points[geom_point.points.28774]
+points[geom_point.points.29302]
[[1]]$`2`
-points[geom_point.points.28776]
+points[geom_point.points.29304]
[[2]]
[[2]]$`1`
-gTree[geom_smooth.gTree.28793]
+gTree[geom_smooth.gTree.29321]
[[2]]$`2`
-gTree[geom_smooth.gTree.28810]
+gTree[geom_smooth.gTree.29338]
The geom_grobs
calculated at this step can also be accessed using the layer_grob()
function on the ggplot object, which is similar to the layer_data()
function:
-
+
[[1]]
[[1]]$`1`
-points[geom_point.points.28934]
+points[geom_point.points.29462]
[[1]]$`2`
-points[geom_point.points.28936]
+points[geom_point.points.29464]
[[2]]
[[2]]$`1`
-gTree[geom_smooth.gTree.28953]
+gTree[geom_smooth.gTree.29481]
[[2]]$`2`
-gTree[geom_smooth.gTree.28970]
+gTree[geom_smooth.gTree.29498]
Each element of geom_grobs
is a list of graphical objects representing a layer’s data in a facet. For example, this draws the data plotted by the first layer in the first facet:
-
+
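The chunks for this step are not visible here, so the following is only a sketch of what accessing and drawing one layer's grobs might look like, assuming the exported `layer_grob()` helper and {grid}. The name `point_grobs` is introduced purely for illustration.

```r
library(ggplot2)
library(grid)

# The expository plot used throughout this chapter
p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(position = position_jitter(seed = 2022)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(vars(year)) +
  ggtitle("A plot for expository purposes")

# Grobs for the first layer (the points): one grob per panel
point_grobs <- layer_grob(p, i = 1L)
length(point_grobs)

# Draw only the points of the first panel on a blank page
grid.newpage()
grid.draw(point_grobs[[1]])
```

Drawing a single grob this way shows the layer's data without axes, strips or titles, since those are only added once the facet assembles the panels.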
After this, the facet takes over and assembles the panels…
The graphical representations of each layer in each facet are combined with other “non-data” elements of the plot at this step, where the plot_table
variable is defined.
-
+
legend_box <- plot$guides$assemble(theme)
-plot_table <- ggtrace_inspect_vars(
- x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
- at = 9, vars = "plot_table"
-)
+plot_table <- ggtrace_inspect_vars(
+ x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
+ at = 9, vars = "plot_table"
+)
plot_table
is a special grob
called a gtable
, which is the same structure as the final form of the ggplot figure before it’s sent off to the rendering system to get drawn:
-
+
TableGrob (6 x 9) "layout": 16 grobs
z cells name grob
-1 1 (4-4,3-3) panel-1-1 gTree[panel-1.gTree.29022]
-2 1 (4-4,7-7) panel-2-1 gTree[panel-2.gTree.29036]
+1 1 (4-4,3-3) panel-1-1 gTree[panel-1.gTree.29550]
+2 1 (4-4,7-7) panel-2-1 gTree[panel-2.gTree.29564]
3 3 (2-2,3-3) axis-t-1-1 zeroGrob[NULL]
4 3 (2-2,7-7) axis-t-2-1 zeroGrob[NULL]
-5 3 (5-5,3-3) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29039]
-6 3 (5-5,7-7) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29039]
+5 3 (5-5,3-3) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29567]
+6 3 (5-5,7-7) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29567]
7 3 (4-4,6-6) axis-l-1-2 zeroGrob[NULL]
-8 3 (4-4,2-2) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29045]
+8 3 (4-4,2-2) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29573]
9 3 (4-4,8-8) axis-r-1-2 zeroGrob[NULL]
10 3 (4-4,4-4) axis-r-1-1 zeroGrob[NULL]
11 2 (3-3,3-3) strip-t-1-1 gtable[strip]
12 2 (3-3,7-7) strip-t-2-1 gtable[strip]
13 4 (1-1,3-7) xlab-t zeroGrob[NULL]
-14 5 (6-6,3-7) xlab-b titleGrob[axis.title.x.bottom..titleGrob.29095]
-15 6 (4-4,1-1) ylab-l titleGrob[axis.title.y.left..titleGrob.29098]
+14 5 (6-6,3-7) xlab-b titleGrob[axis.title.x.bottom..titleGrob.29623]
+15 6 (4-4,1-1) ylab-l titleGrob[axis.title.y.left..titleGrob.29626]
16 7 (4-4,9-9) ylab-r zeroGrob[NULL]
When it is first defined, it’s only a partially complete representation of the plot - title, legend, margins, etc. are missing:
-
+
Recall that plot_table
is the output of layout$render
:
-
+
legend_box <- plot$guides$assemble(theme)
This is the load-bearing step that computes/defines a bunch of smaller components internally:
-
+
<ggproto method>
<Wrapper function>
function (...)
@@ -768,100 +774,100 @@ 18.3.1 Rendering the panels
We can inspect these individual components:
-
-
+
+
[[1]]
zeroGrob[NULL]
[[2]]
zeroGrob[NULL]
-
+
[[1]]
zeroGrob[NULL]
[[2]]
zeroGrob[NULL]
-
+
[[1]]
-gTree[panel-1.gTree.29182]
+gTree[panel-1.gTree.29710]
[[2]]
-gTree[panel-2.gTree.29196]
-
+gTree[panel-2.gTree.29724]
+
TableGrob (4 x 7) "layout": 12 grobs
z cells name grob
-1 1 (3-3,2-2) panel-1-1 gTree[panel-1.gTree.29182]
-2 1 (3-3,6-6) panel-2-1 gTree[panel-2.gTree.29196]
+1 1 (3-3,2-2) panel-1-1 gTree[panel-1.gTree.29710]
+2 1 (3-3,6-6) panel-2-1 gTree[panel-2.gTree.29724]
3 3 (1-1,2-2) axis-t-1-1 zeroGrob[NULL]
4 3 (1-1,6-6) axis-t-2-1 zeroGrob[NULL]
-5 3 (4-4,2-2) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29199]
-6 3 (4-4,6-6) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29199]
+5 3 (4-4,2-2) axis-b-1-1 absoluteGrob[GRID.absoluteGrob.29727]
+6 3 (4-4,6-6) axis-b-2-1 absoluteGrob[GRID.absoluteGrob.29727]
7 3 (3-3,5-5) axis-l-1-2 zeroGrob[NULL]
-8 3 (3-3,1-1) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29205]
+8 3 (3-3,1-1) axis-l-1-1 absoluteGrob[GRID.absoluteGrob.29733]
9 3 (3-3,7-7) axis-r-1-2 zeroGrob[NULL]
10 3 (3-3,3-3) axis-r-1-1 zeroGrob[NULL]
11 2 (2-2,2-2) strip-t-1-1 gtable[strip]
12 2 (2-2,6-6) strip-t-2-1 gtable[strip]
-
+
$x
$x[[1]]
zeroGrob[NULL]
$x[[2]]
-titleGrob[axis.title.x.bottom..titleGrob.29255]
+titleGrob[axis.title.x.bottom..titleGrob.29783]
$y
$y[[1]]
-titleGrob[axis.title.y.left..titleGrob.29258]
+titleGrob[axis.title.y.left..titleGrob.29786]
$y[[2]]
zeroGrob[NULL]
18.3.1.1 Sneak peek:
The rest of the gtable step is just updating this plot_table
object.
-all_plot_table_versions <- ggtrace_inspect_vars(
- x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
- at = "all", vars = "plot_table"
-)
-names(all_plot_table_versions)
+all_plot_table_versions <- ggtrace_inspect_vars(
+ x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
+ at = "all", vars = "plot_table"
+)
+names(all_plot_table_versions)
[1] "Step8" "Step10" "Step22" "Step23" "Step24" "Step25" "Step26" "Step27"
[9] "Step28" "Step29" "Step30" "Step31" "Step32" "Step33" "Step34"
-lapply(seq_along(all_plot_table_versions), function(i) {
- ggsave(tempfile(sprintf("plot_table_%02d_", i), fileext = ".png"), all_plot_table_versions[[i]])
-})
-dir(tempdir(), "plot_table_.*png", full.names = TRUE) %>%
- magick::image_read() %>%
- magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>%
- magick::image_write_gif("images/plot_table_animation1.gif", delay = .5)
+lapply(seq_along(all_plot_table_versions), function(i) {
+ ggsave(tempfile(sprintf("plot_table_%02d_", i), fileext = ".png"), all_plot_table_versions[[i]])
+})
+dir(tempdir(), "plot_table_.*png", full.names = TRUE) %>%
+ magick::image_read() %>%
+ magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>%
+ magick::image_write_gif("images/plot_table_animation1.gif", delay = .5)
-
all_plot_table_versions2 <- ggtrace_inspect_vars(
- x = p +
- labs(
- subtitle = "This is a subtitle",
- caption = "@yjunechoe",
- tag = "A"
- )
- ,
- method = ggplot2:::ggplot_gtable.ggplot_built,
- at = "all", vars = "plot_table"
-)
-identical(names(all_plot_table_versions), names(all_plot_table_versions2))
-lapply(seq_along(all_plot_table_versions2), function(i) {
- ggsave(tempfile(sprintf("plot_table2_%02d_", i), fileext = ".png"), all_plot_table_versions2[[i]])
-})
-dir(tempdir(), "plot_table2_.*png", full.names = TRUE) %>%
- magick::image_read() %>%
- magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>%
- magick::image_write_gif("images/plot_table_animation2.gif", delay = .5)
+all_plot_table_versions2 <- ggtrace_inspect_vars(
+ x = p +
+ labs(
+ subtitle = "This is a subtitle",
+ caption = "@yjunechoe",
+ tag = "A"
+ )
+ ,
+ method = ggplot2:::ggplot_gtable.ggplot_built,
+ at = "all", vars = "plot_table"
+)
+identical(names(all_plot_table_versions), names(all_plot_table_versions2))
+lapply(seq_along(all_plot_table_versions2), function(i) {
+ ggsave(tempfile(sprintf("plot_table2_%02d_", i), fileext = ".png"), all_plot_table_versions2[[i]])
+})
+dir(tempdir(), "plot_table2_.*png", full.names = TRUE) %>%
+ magick::image_read() %>%
+ magick::image_annotate(names(all_plot_table_versions), location = "+1050+0", size = 100) %>%
+ magick::image_write_gif("images/plot_table_animation2.gif", delay = .5)
18.3.1.1 Sneak peek:
18.3.2 Adding guides
The legend (legend_box
) is first defined in Step 11:
-
+
title_height <- grobHeight(title)
-legend_box <- ggtrace_inspect_vars(
- x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
- at = 12, vars = "legend_box"
-)
-grid.newpage()
-grid.draw(legend_box)
+legend_box <- ggtrace_inspect_vars(
+ x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
+ at = 12, vars = "legend_box"
+)
+grid.newpage()
+grid.draw(legend_box)
It then undergoes some edits/tweaks, including resolving the legend.position
theme setting, and then finally gets added to the plot in Step 15:
-
+
caption_height <- grobHeight(caption)
-p_with_legend <- ggtrace_inspect_vars(
- x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
- at = 16, vars = "plot_table"
-)
-grid.newpage()
-grid.draw(p_with_legend)
+p_with_legend <- ggtrace_inspect_vars(
+ x = p, method = ggplot2:::ggplot_gtable.ggplot_built,
+ at = 16, vars = "plot_table"
+)
+grid.newpage()
+grid.draw(p_with_legend)
The bulk of the work was done in Step 11, with the build_guides()
function. That in turn calls guides_train()
and guides_gengrob()
which in turn call guide_train()
and guide_gengrob()
for each scale (including positional aesthetics like x and y).
Why the scale? The scale is actually what holds the information about the guide. They’re two sides of the same coin - the scale translates the underlying data to some defined space, and the guide reverses that (translates a space back to data). One is for drawing, the other is for reading.
This is also why all scale_*()
functions take a guide
argument. Positional scales use guide_axis()
as default, and non-positional scales use guide_legend()
as default.
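To make the scale/guide pairing concrete, here is a small sketch that overrides the default guides through the `guide` argument of two scales. The plot `g` is a hypothetical example rather than the chapter's `p`; `guide_axis(angle = )` and `guide_legend(reverse = )` are existing ggplot2 arguments.

```r
library(ggplot2)

g <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  # Positional scale: customise the axis guide
  scale_x_continuous(guide = guide_axis(angle = 45)) +
  # Non-positional scale: customise the legend guide
  scale_colour_discrete(guide = guide_legend(reverse = TRUE))

g
```

Passing `guide = "none"` to a scale suppresses its guide altogether, which is another hint that the guide belongs to the scale rather than to the layer.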
-
+
[1] "GuideLegend" "Guide" "ggproto" "gg"
-
+
This is the output of the guide_train()
method defined for guide_legend()
. The most important piece of it is key
, which is the data associated with the legend.
-# TODO: The unexported function no longer exists, so we had to turn off eval.
-names( ggtrace_inspect_return(p, ggplot2:::guide_train.legend) )
-ggtrace_inspect_return(p, ggplot2:::guide_train.legend)$key
+# TODO: The unexported function no longer exists, so we had to turn off eval.
+names( ggtrace_inspect_return(p, ggplot2:::guide_train.legend) )
+ggtrace_inspect_return(p, ggplot2:::guide_train.legend)$key
The output of guide_train()
is passed to guide_gengrob()
. This is the output of the guide_gengrob()
method defined for guide_legend()
:
-
+
@@ -871,41 +877,41 @@
18.3.3 Adding adornment
@@ -914,10 +920,10 @@ 18.3.3 Adding adornment
18.3.4 Output
To put it all together:
-
+
diff --git a/the-plot-method.html b/the-plot-method.html
index 7579d0e7..c1c85005 100644
--- a/the-plot-method.html
+++ b/the-plot-method.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -598,48 +604,48 @@
18.1 The plot()
method
The user-facing code and the internal code are also separated by when they are evaluated. The user-facing code like geom_smooth()
is evaluated immediately to give you a ggplot object, but the internal code is only evaluated when a ggplot object is printed or plotted, via print()
and plot()
.
The following code simply creates a ggplot object from user-facing code, and DOES NOT print or plot the ggplot (yet).
-p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
- geom_point(position = position_jitter(seed = 2022)) +
- geom_smooth(method = "lm", formula = y ~ x) +
- facet_wrap(vars(year)) +
- ggtitle("A plot for expository purposes")
+p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
+ geom_point(position = position_jitter(seed = 2022)) +
+ geom_smooth(method = "lm", formula = y ~ x) +
+ facet_wrap(vars(year)) +
+ ggtitle("A plot for expository purposes")
The ggplot object is actually just a list under the hood:
-
+
[1] "gg" "ggplot"
-
+
[1] "list"
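The chunks behind the two outputs above are not visible here. A sketch that would plausibly produce output of this shape, assuming a ggplot2 version in which ggplot objects are S3 lists (newer versions may report a different class vector and type), is:

```r
library(ggplot2)

# Defining the plot does not draw anything yet
p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
  geom_point(position = position_jitter(seed = 2022)) +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(vars(year)) +
  ggtitle("A plot for expository purposes")

# The S3 class vector of the object
class(p)

# The underlying storage type of the object
typeof(p)
```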
Evaluating the ggplot is what gives you the actual points, rectangles, text, etc. that make up the figure (and you can also do so explicitly with print()
/plot()
)
-
+
-
+
These are two separate processes, but we often think of them as one monolithic process:
-defining_benchmark <- bench::mark(
- # Evaluates user-facing code to define ggplot,
- # but does not call plot/print method
- p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
- geom_point(position = position_jitter(seed = 2022)) +
- geom_smooth(method = "lm", formula = y ~ x) +
- facet_wrap(vars(year)) +
- ggtitle("A plot for expository purposes")
-)
-
-plotting_benchmark <- bench::mark(
- # Plots the ggplot
- plot(p)
-)
-
+defining_benchmark <- bench::mark(
+ # Evaluates user-facing code to define ggplot,
+ # but does not call plot/print method
+ p <- ggplot(mpg, aes(displ, hwy, color = drv)) +
+ geom_point(position = position_jitter(seed = 2022)) +
+ geom_smooth(method = "lm", formula = y ~ x) +
+ facet_wrap(vars(year)) +
+ ggtitle("A plot for expository purposes")
+)
+
+plotting_benchmark <- bench::mark(
+ # Plots the ggplot
+ plot(p)
+)
+
# A tibble: 2 × 4
min median `itr/sec` mem_alloc
<bch:tm> <bch:tm> <dbl> <bch:byt>
-1 3.06ms 3.34ms 295. 20.36KB
-2 231.52ms 233.04ms 4.29 3.51MB
+1 3.09ms 3.2ms 305. 20.36KB
+2 234.38ms 234.4ms 4.27 3.51MB
The plot that gets rendered from a ggplot object is actually a side effect of evaluating the ggplot object:
-
+
function (x, newpage = is.null(vp), vp = NULL, ...)
{
set_last_plot(x)
@@ -664,47 +670,47 @@ 18.1 The plot()
meth
}
invisible(x)
}
-<bytecode: 0x5567abddec40>
+<bytecode: 0x55d9296d2200>
<environment: namespace:ggplot2>
The above code can be simplified to this:
-ggprint <- function(x) {
- data <- ggplot_build(x)
- gtable <- ggplot_gtable(data)
- grid::grid.newpage()
- grid::grid.draw(gtable)
- return(invisible(x)) #< hence "side effect"
-}
-
-ggprint(p)
+ggprint <- function(x) {
+ data <- ggplot_build(x)
+ gtable <- ggplot_gtable(data)
+ grid::grid.newpage()
+ grid::grid.draw(gtable)
+ return(invisible(x)) #< hence "side effect"
+}
+
+ggprint(p)
Roughly put, you start out with the ggplot object, which then gets passed to ggplot_build()
, the result of which in turn gets passed to ggplot_gtable()
and is finally drawn with {grid}:
-library(grid)
-grid.newpage() # Clear display
-p %>%
- ggplot_build() %>% # 1. data for each layer is prepared for drawing
- ggplot_gtable() %>% # 2. drawing-ready data is turned into graphical elements
- grid.draw() # 3. graphical elements are converted to an image
+library(grid)
+grid.newpage() # Clear display
+p %>%
+ ggplot_build() %>% # 1. data for each layer is prepared for drawing
+ ggplot_gtable() %>% # 2. drawing-ready data is turned into graphical elements
+ grid.draw() # 3. graphical elements are converted to an image
At each step, you get closer to the low-level information you need to draw the actual plot:
-obj_byte <- function(x) {
- scales::label_bytes()(as.numeric(object.size(x)))
-}
-
-# ggplot object
-p %>% obj_byte()
+obj_byte <- function(x) {
+ scales::label_bytes()(as.numeric(object.size(x)))
+}
+
+# ggplot object
+p %>% obj_byte()
[1] "32 kB"
-
+
[1] "102 kB"
-
+
[1] "684 kB"
-# the rendered plot
-ggsave(
- filename = tempfile(fileext = ".png"),
- plot = ggplot_gtable(ggplot_build(p)),
- # File size depends on format, dimension, resolution, etc.
-) %>% file.size() %>% {scales::label_bytes()(.)}
+# the rendered plot
+ggsave(
+ filename = tempfile(fileext = ".png"),
+ plot = ggplot_gtable(ggplot_build(p)),
+ # File size depends on format, dimension, resolution, etc.
+) %>% file.size() %>% {scales::label_bytes()(.)}
[1] "243 kB"
The rest of the chapter focuses on what happens in this pipeline - the ggplot_build()
step and the ggplot_gtable()
step.
diff --git a/theme.html b/theme.html
index 054582f5..41097052 100644
--- a/theme.html
+++ b/theme.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -606,35 +612,35 @@
16.1 Theme
16.1.1 Complete themes
In ggplot2 there are preset themes ready to use:
-library(tidyverse)
-df <- data.frame(x = 1:3, y = 1:3)
-base <- ggplot(df, aes(x, y)) + geom_point()
-p1<-base + theme_grey() + ggtitle("theme_grey()")
-p2<-base + theme_bw() + ggtitle("theme_bw()")
-p3<-base + theme_linedraw() + ggtitle("theme_linedraw()")
-
-library(patchwork)
-p1+p2+p3
-
-p4<-base + theme_light() + ggtitle("theme_light()")
-p5<- base + theme_dark() + ggtitle("theme_dark()")
-p6<-base + theme_minimal() + ggtitle("theme_minimal()")
-
-p4+p5+p6
-
-p7<-base + theme_classic() + ggtitle("theme_classic()")
-p8<-base + theme_void() + ggtitle("theme_void()")
-
-p7+p8
-
+library(tidyverse)
+df <- data.frame(x = 1:3, y = 1:3)
+base <- ggplot(df, aes(x, y)) + geom_point()
+p1<-base + theme_grey() + ggtitle("theme_grey()")
+p2<-base + theme_bw() + ggtitle("theme_bw()")
+p3<-base + theme_linedraw() + ggtitle("theme_linedraw()")
+
+library(patchwork)
+p1+p2+p3
+
+p4<-base + theme_light() + ggtitle("theme_light()")
+p5<- base + theme_dark() + ggtitle("theme_dark()")
+p6<-base + theme_minimal() + ggtitle("theme_minimal()")
+
+p4+p5+p6
+
+p7<-base + theme_classic() + ggtitle("theme_classic()")
+p8<-base + theme_void() + ggtitle("theme_void()")
+
+p7+p8
+
Or, you can use other packages such as {ggthemes}, or others listed here: ggplot extension gallery
-library(ggthemes)
-p9<-base + theme_tufte() + ggtitle("theme_tufte()")
-p10<-base + theme_solarized() + ggtitle("theme_solarized()")
-p11<-base + theme_excel() + ggtitle("theme_excel()")
-
-p9+p10+p11
-
+library(ggthemes)
+p9<-base + theme_tufte() + ggtitle("theme_tufte()")
+p10<-base + theme_solarized() + ggtitle("theme_solarized()")
+p11<-base + theme_excel() + ggtitle("theme_excel()")
+
+p9+p10+p11
+
- Modifying complete theme components with
theme()
function
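As a minimal sketch of that last point, the example below starts from a complete theme and then overrides two of its components with `theme()`. The data frame matches the `df` used above; the particular components changed are arbitrary choices for illustration.

```r
library(ggplot2)

df <- data.frame(x = 1:3, y = 1:3)
base <- ggplot(df, aes(x, y)) + geom_point()

styled <- base +
  theme_minimal() +                              # complete theme first
  theme(
    plot.title = element_text(face = "bold"),    # then override components
    panel.grid.minor = element_blank()
  ) +
  ggtitle("theme_minimal() with two overrides")

styled
```

The order matters here: adding a complete theme after the `theme()` call would replace the overrides again.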
diff --git a/themes-1.html b/themes-1.html
index fe86276b..2bc20040 100644
--- a/themes-1.html
+++ b/themes-1.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -603,7 +609,7 @@
19.2 Themes
-
+
function (base_size = 11, base_family = "", base_line_size = base_size/22,
base_rect_size = base_size/22)
{
@@ -616,9 +622,9 @@ 19.2 Themesprint(theme_minimal)
+
function (base_size = 11, base_family = "", base_line_size = base_size/22,
base_rect_size = base_size/22)
{
@@ -629,7 +635,7 @@ 19.2 Themes
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/theory-of-scales-and-guides.html b/theory-of-scales-and-guides.html
index 08d212d1..75d96afb 100644
--- a/theory-of-scales-and-guides.html
+++ b/theory-of-scales-and-guides.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -636,33 +642,33 @@
13.1 Theory of scales and guides<
13.1.1 Scale specification
An important property of ggplot2 is the principle that every aesthetic in your plot is associated with exactly one scale. For instance, when you write this
-
+
ggplot2 adds a default scale for each aesthetic used in the plot:
-ggplot(mpg, aes(displ, hwy)) +
- geom_point(aes(colour = class)) +
- scale_x_continuous() +
- scale_y_continuous() +
- scale_colour_discrete()
-ggplot(mpg, aes(displ, hwy)) +
- geom_point(aes(colour = class)) +
- scale_x_continuous(name = "A really awesome x axis label") +
- scale_y_continuous(name = "An amazingly great y axis label")
+ggplot(mpg, aes(displ, hwy)) +
+ geom_point(aes(colour = class)) +
+ scale_x_continuous() +
+ scale_y_continuous() +
+ scale_colour_discrete()
+ggplot(mpg, aes(displ, hwy)) +
+ geom_point(aes(colour = class)) +
+ scale_x_continuous(name = "A really awesome x axis label") +
+ scale_y_continuous(name = "An amazingly great y axis label")
The use of +
to “add” scales to a plot is a little misleading because if you supply two scales for the same aesthetic, the last scale takes precedence:
-ggplot(mpg, aes(displ, hwy)) +
- geom_point() +
- scale_x_continuous(name = "Label 1") +
- scale_x_continuous(name = "Label 2")
-#> Scale for 'x' is already present. Adding another scale for 'x', which will
-#> replace the existing scale.
-
-ggplot(mpg, aes(displ, hwy)) +
- geom_point() +
- scale_x_continuous(name = "Label 2")
-ggplot(mpg, aes(displ, hwy)) +
- geom_point(aes(colour = class)) +
- scale_x_sqrt() +
- scale_colour_brewer()
+ggplot(mpg, aes(displ, hwy)) +
+ geom_point() +
+ scale_x_continuous(name = "Label 1") +
+ scale_x_continuous(name = "Label 2")
+#> Scale for 'x' is already present. Adding another scale for 'x', which will
+#> replace the existing scale.
+
+ggplot(mpg, aes(displ, hwy)) +
+ geom_point() +
+ scale_x_continuous(name = "Label 2")
+
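One way to confirm which scale ended up on the plot, sketched here with the exported `layer_scales()` helper, is to read the name of the x scale back after building. The object name `p2` is hypothetical.

```r
library(ggplot2)

# Adding a second x scale replaces the first (ggplot2 prints a message)
p2 <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_x_continuous(name = "Label 1") +
  scale_x_continuous(name = "Label 2")

# layer_scales() returns the x and y scales used by a layer's panel
x_scale <- layer_scales(p2)$x
x_scale$name
```

If the last scale takes precedence as described, the name read back should be the second label.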
13.1.2 Naming scheme
diff --git a/use-components-annotation-and-additional-arguments-in-a-plot.html b/use-components-annotation-and-additional-arguments-in-a-plot.html
index ed54e6b9..c392a620 100644
--- a/use-components-annotation-and-additional-arguments-in-a-plot.html
+++ b/use-components-annotation-and-additional-arguments-in-a-plot.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -601,51 +607,51 @@
17.2 Use components, annotation,
“A quick and dirty way to get map data (from the maps package) on to your plot.”
-
-
-
+borders <- function(database = "world", regions = ".", fill = NA,
+ colour = "grey50", ...) {
+ df <- map_data(database, regions)
+ geom_polygon(
+ aes_(~long, ~lat, group = ~group),
+ data = df, fill = fill, colour = colour, ...,
+ inherit.aes = FALSE, show.legend = FALSE
+ )
+}
+library(maps)
+data(us.cities)
+capitals <- subset(us.cities, capital == 2)
+
+ggplot(capitals, aes(long, lat)) +
+ borders("world", xlim = c(-130, -60), ylim = c(20, 50)) +
+ geom_point(aes(size = pop)) +
+ scale_size_area() +
+ coord_quickmap()
+
We can even add additional arguments, using functions such as these to modify them and pass them on:
modifyList()
do.call()
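For readers who have not met these two base R functions, here is a small sketch of what each does on its own. The list contents are illustrative only.

```r
# modifyList() overlays the second list on the first:
# shared names are replaced, new names are appended.
defaults  <- list(fun = "mean", geom = "bar", fill = "grey70")
overrides <- list(fill = "steelblue", alpha = 0.5)
merged <- modifyList(defaults, overrides)
str(merged)

# do.call() calls a function (given by name or value) with a list
# of arguments, as if they had been typed out in the call.
joined <- do.call("paste", list("a", "b", sep = "-"))
joined
```

Combining the two gives a pattern of "start from defaults, let the caller override, then call the function with the merged list".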
-geom_mean <- function(..., bar.params = list(), errorbar.params = list()) {
- params <- list(...)
- bar.params <- modifyList(params, bar.params)
- errorbar.params <- modifyList(params, errorbar.params)
-
- bar <- do.call("stat_summary", modifyList(
- list(fun = "mean", geom = "bar", fill = "grey70"),
- bar.params)
- )
- errorbar <- do.call("stat_summary", modifyList(
- list(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4),
- errorbar.params)
- )
-
- list(bar, errorbar)
-}
+geom_mean <- function(..., bar.params = list(), errorbar.params = list()) {
+ params <- list(...)
+ bar.params <- modifyList(params, bar.params)
+ errorbar.params <- modifyList(params, errorbar.params)
+
+ bar <- do.call("stat_summary", modifyList(
+ list(fun = "mean", geom = "bar", fill = "grey70"),
+ bar.params)
+ )
+ errorbar <- do.call("stat_summary", modifyList(
+ list(fun.data = "mean_cl_normal", geom = "errorbar", width = 0.4),
+ errorbar.params)
+ )
+
+ list(bar, errorbar)
+}
And here is the result:
-
-
+
+
diff --git a/visualizing-networks.html b/visualizing-networks.html
index eb226456..a5bb6bca 100644
--- a/visualizing-networks.html
+++ b/visualizing-networks.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -612,54 +618,54 @@
6.3.1.1 Specifying a layoutAs an example we take the data(highschool, package = "ggraph")
and make a visualization of the graph:
hs_graph <- tidygraph::as_tbl_graph(highschool,
directed = FALSE)
-
+
A second example is with more features:
-hs_graph <- hs_graph %>%
- tidygraph::activate(edges) %>%
- mutate(edge_weights = runif(n()))
-
-
-ggraph(hs_graph, layout = "stress", weights = edge_weights) +
- geom_edge_link(aes(alpha = edge_weights)) +
- geom_node_point() +
- scale_edge_alpha_identity()
+hs_graph <- hs_graph %>%
+ tidygraph::activate(edges) %>%
+ mutate(edge_weights = runif(n()))
+
+
+ggraph(hs_graph, layout = "stress", weights = edge_weights) +
+ geom_edge_link(aes(alpha = edge_weights)) +
+ geom_node_point() +
+ scale_edge_alpha_identity()
In the following examples we see different layouts.
Information about the “drl” layout (DRL force-directed graph layout) can be found in the igraph package.
-layout <- ggraph::create_layout(hs_graph, layout = 'drl')
-
-ggraph(layout) +
- geom_edge_link() +
- geom_node_point()
+layout <- ggraph::create_layout(hs_graph, layout = 'drl')
+
+ggraph(layout) +
+ geom_edge_link() +
+ geom_node_point()
Instead of {tidygraph} we use {igraph}, with layout = “kk”: layout.kamada.kawai
-require(ggraph)
-require(igraph)
-
-hs_graph2 <- igraph::graph_from_data_frame(highschool)
-
-layout <- create_layout(hs_graph2, layout = "kk")
-
-ggraph(layout) +
- geom_edge_link(aes(colour = factor(year))) +
- geom_node_point()
+require(ggraph)
+require(igraph)
+
+hs_graph2 <- igraph::graph_from_data_frame(highschool)
+
+layout <- create_layout(hs_graph2, layout = "kk")
+
+ggraph(layout) +
+ geom_edge_link(aes(colour = factor(year))) +
+ geom_node_point()
A very simple example to understand how to make a graph network is from this tutorial: Networks in igraph
To understand a bit more about the graph structure we can use these functions:
-
-## + 3/3 edges from 5c39927:
+
+## + 3/3 edges from 43e43ed:
## [1] 1--2 2--3 1--3
-
-## + 3/3 vertices, from 5c39927:
+
+## + 3/3 vertices, from 43e43ed:
## [1] 1 2 3
-
+
## 3 x 3 sparse Matrix of class "dgCMatrix"
##
## [1,] . 1 1
@@ -670,14 +676,14 @@ 6.3.1.1 Specifying a layout6.3.1.2 Circularity
Layouts can be linear and circular.
coord_polar() changes the coordinate system but does not affect the edges
-
+
-ggraph(luv_graph, layout = 'dendrogram') +
- geom_edge_link() +
- coord_polar() +
- scale_y_reverse()
+ggraph(luv_graph, layout = 'dendrogram') +
+ geom_edge_link() +
+ coord_polar() +
+ scale_y_reverse()
@@ -691,10 +697,10 @@ 6.3.2 Drawing nodesGetting Started guide to nodes
-
+
More features could be added to calculate node and edge centrality, such as:
@@ -702,13 +708,13 @@ 6.3.2 Drawing nodesggraph(luv_graph, layout = "stress") +
- geom_edge_link() +
- geom_node_point(aes(colour =centrality_power()))
+ggraph(luv_graph, layout = "stress") +
+ geom_edge_link() +
+ geom_node_point(aes(colour =centrality_power()))
Or making tiles:
-
+
@@ -727,44 +733,44 @@ 6.3.3 Drawing edgesGetting Started guide to edges
The after_stat(index)
:
-set.seed(123)
-ggraph(hs_graph, layout = "stress") +
- geom_edge_link(aes(alpha = after_stat(index)))
+set.seed(123)
+ggraph(hs_graph, layout = "stress") +
+ geom_edge_link(aes(alpha = after_stat(index)))
Here is an example of how to use the node.class variable. The graph is the first one we saw, generated artificially with:
tidygraph::play_erdos_renyi()
-graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>%
- activate(nodes) %>%
- mutate(class = sample(letters[1:4],
- n(), replace = TRUE)) %>%
- activate(edges) %>%
- arrange(.N()$class[from])
-
-
-ggraph(graph, layout = "stress") +
- geom_edge_link2(
- aes(colour = node.class),
- width = 3,
- lineend = "round")
+graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>%
+ activate(nodes) %>%
+ mutate(class = sample(letters[1:4],
+ n(), replace = TRUE)) %>%
+ activate(edges) %>%
+ arrange(.N()$class[from])
+
+
+ggraph(graph, layout = "stress") +
+ geom_edge_link2(
+ aes(colour = node.class),
+ width = 3,
+ lineend = "round")
-
+
Trees and specifically dendrograms:
-
+
@@ -785,10 +791,10 @@ 6.3.4 Facetingggraph(hs_graph, layout = "stress") +
- geom_edge_link() +
- geom_node_point() +
- facet_edges(~year)
+ggraph(hs_graph, layout = "stress") +
+ geom_edge_link() +
+ geom_node_point() +
+ facet_edges(~year)
diff --git a/weighted-data.html b/weighted-data.html
index 991b6b0f..3444185f 100644
--- a/weighted-data.html
+++ b/weighted-data.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -597,46 +603,46 @@
4.3 Weighted Data
If each row of your data frame represents multiple observations, we can use a weight variable to scale each row's visual contribution accordingly
-
+
-# Weight by population
-ggplot(midwest, aes(percwhite, percbelowpoverty)) +
- geom_point(aes(size = poptotal / 1e6)) +
- scale_size_area("Population\n(millions)", breaks = c(0.5, 1, 2, 4))
+# Weight by population
+ggplot(midwest, aes(percwhite, percbelowpoverty)) +
+ geom_point(aes(size = poptotal / 1e6)) +
+ scale_size_area("Population\n(millions)", breaks = c(0.5, 1, 2, 4))
-# Unweighted
-ggplot(midwest, aes(percwhite, percbelowpoverty)) +
- geom_point() +
- geom_smooth(method = lm, size = 1)
+# Unweighted
+ggplot(midwest, aes(percwhite, percbelowpoverty)) +
+ geom_point() +
+ geom_smooth(method = lm, size = 1)
## `geom_smooth()` using formula = 'y ~ x'
-# Weighted by population
-ggplot(midwest, aes(percwhite, percbelowpoverty)) +
- geom_point(aes(size = poptotal / 1e6)) +
- geom_smooth(aes(weight = poptotal), method = lm, size = 1) +
- scale_size_area(guide = "none")
+# Weighted by population
+ggplot(midwest, aes(percwhite, percbelowpoverty)) +
+ geom_point(aes(size = poptotal / 1e6)) +
+ geom_smooth(aes(weight = poptotal), method = lm, size = 1) +
+ scale_size_area(guide = "none")
## `geom_smooth()` using formula = 'y ~ x'
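To see what the weight aesthetic changes in the smooth, here is a small sketch using lm() directly. It assumes that, with method = lm, the weight is handed to the weights argument of lm(); the toy data are made up for illustration and are not from midwest.

```r
# Hypothetical data: three heavily weighted points on the line y = x,
# and one lightly weighted outlier.
toy <- data.frame(
  x = c(1, 2, 3, 4),
  y = c(1, 2, 3, 10),
  w = c(100, 100, 100, 1)
)

# Unweighted fit: every row counts equally, so the outlier pulls the slope up.
fit_unweighted <- lm(y ~ x, data = toy)

# Weighted fit: rows with larger weights dominate the least-squares criterion.
fit_weighted <- lm(y ~ x, data = toy, weights = w)

slope_unweighted <- coef(fit_unweighted)[["x"]]
slope_weighted   <- coef(fit_weighted)[["x"]]
print(c(unweighted = slope_unweighted, weighted = slope_weighted))
```

The weighted slope stays closer to the trend of the heavily weighted rows, which is the effect the population weight has in the plot above.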
-
+
-ggplot(midwest, aes(percbelowpoverty)) +
- geom_histogram(aes(weight = poptotal), binwidth = 1) +
- ylab("Population (1000s)")
+ggplot(midwest, aes(percbelowpoverty)) +
+ geom_histogram(aes(weight = poptotal), binwidth = 1) +
+ ylab("Population (1000s)")
Question for the group: Is the above ylab
correct? Check out the next two figures, can you see the difference?
-ggplot(midwest, aes(percbelowpoverty)) +
- geom_histogram(aes(weight = poptotal/1e3), binwidth = 1) +
- ylab("Population (1000s)")
+ggplot(midwest, aes(percbelowpoverty)) +
+ geom_histogram(aes(weight = poptotal/1e3), binwidth = 1) +
+ ylab("Population (1000s)")
-
+
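One way to reason about the ylab question: a weighted histogram sums the weights in each bin instead of counting rows, so the units of the y axis are the units of the weight. The sketch below uses made-up numbers and bins with floor(), which is an assumption for simplicity and not ggplot2's default bin alignment.

```r
# Hypothetical counties: poverty rate (%) and population (persons).
toy <- data.frame(
  poverty = c(5.2, 5.7, 6.1, 6.8, 7.3),
  pop     = c(12000, 3000, 45000, 5000, 20000)
)

# Assign each row to a bin of width 1.
bin <- floor(toy$poverty)

# weight = pop: bar heights are in persons.
persons <- tapply(toy$pop, bin, sum)

# weight = pop / 1e3: bar heights are in thousands of persons.
thousands <- tapply(toy$pop / 1e3, bin, sum)

print(persons)
print(thousands)
```

Only the second version matches a y-axis label of "Population (1000s)".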
diff --git a/welcome-to-ggplot2.html b/welcome-to-ggplot2.html
index fa6dc80e..378ed527 100644
--- a/welcome-to-ggplot2.html
+++ b/welcome-to-ggplot2.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
diff --git a/what-is-network-data.html b/what-is-network-data.html
index dfe7f863..43b8402c 100644
--- a/what-is-network-data.html
+++ b/what-is-network-data.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
+
- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+- 2.11 Meeting Videos
+
- 3 Collective Geoms
@@ -609,50 +615,50 @@
6.2.1 A tidy network manipulation
In this example we create a graph, assign a random label to the nodes, and sort the edges based on the label of their source node.
The function play_erdos_renyi()
creates graphs directly through sampling of different attributes.
-library(tidygraph)
-
-graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>%
- activate(nodes) %>%
- mutate(class = sample(letters[1:4], n(), replace = TRUE)) %>%
- activate(edges) %>%
- arrange(.N()$class[from])
-
-graph
+library(tidygraph)
+
+graph <- tidygraph::play_erdos_renyi(n = 10, p = 0.2) %>%
+ activate(nodes) %>%
+ mutate(class = sample(letters[1:4], n(), replace = TRUE)) %>%
+ activate(edges) %>%
+ arrange(.N()$class[from])
+
+graph
## # A tbl_graph: 10 nodes and 14 edges
## #
-## # A directed simple graph with 3 components
+## # A directed simple graph with 1 component
## #
## # Edge Data: 14 × 2 (active)
## from to
## <int> <int>
-## 1 1 10
-## 2 2 1
-## 3 2 3
-## 4 4 10
-## 5 10 4
-## 6 2 5
-## 7 2 8
-## 8 9 2
-## 9 9 4
-## 10 9 10
-## 11 3 2
-## 12 3 10
-## 13 5 8
-## 14 8 9
+## 1 4 5
+## 2 9 2
+## 3 10 3
+## 4 2 5
+## 5 2 9
+## 6 7 1
+## 7 3 2
+## 8 8 3
+## 9 8 4
+## 10 3 5
+## 11 6 10
+## 12 8 6
+## 13 7 8
+## 14 8 10
## #
## # Node Data: 10 × 1
## class
## <chr>
## 1 a
-## 2 a
-## 3 c
+## 2 c
+## 3 d
## # ℹ 7 more rows
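The step arrange(.N()$class[from]) can be read as "look up the class of each edge's source node, then sort the edges by it". A rough base R equivalent, using plain data frames with hypothetical values rather than a tbl_graph, might look like this:

```r
# Hypothetical node and edge tables; `from` and `to` index rows of `nodes`.
nodes <- data.frame(class = c("b", "a", "c"))
edges <- data.frame(
  from = c(1L, 3L, 2L, 1L),
  to   = c(2L, 1L, 3L, 3L)
)

# Class of the source node of every edge (what .N()$class[from] looks up).
source_class <- nodes$class[edges$from]

# Reorder the edges by that class; ties keep their original order.
sorted_edges <- edges[order(source_class), ]
print(sorted_edges)
```

In tidygraph, .N() gives access to the node data while the edges are active, which is what makes the one-line version possible.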
6.2.2 Conversion
Data can be converted with as_tbl_graph()
, which returns a tbl_graph, a data structure for tidy graph manipulation. It can convert a data frame encoded as an edge list, as well as the result of hclust()
-
+
## from to year
## 1 1 14 1957
## 2 1 15 1957
@@ -661,8 +667,8 @@ 6.2.2 Conversionhs_graph <- tidygraph::as_tbl_graph(highschool, directed = FALSE)
-hs_graph
+
## # A tbl_graph: 70 nodes and 506 edges
## #
## # An undirected multigraph with 1 component
@@ -680,10 +686,10 @@ 6.2.2 Conversion6.2.2.1 hclust() and dist() functions:
In this example the luv_colours dataset contains all built-in colors() translated into Luv colour space, stored as a data frame with 657 observations and 4 variables:
luv_colours
-luv_colours <- as.data.frame(convertColor(t(col2rgb(colors())),
- "sRGB", "Luv"))
-luv_colours$col <- colors()
-head(luv_colours)
+luv_colours <- as.data.frame(convertColor(t(col2rgb(colors())),
+ "sRGB", "Luv"))
+luv_colours$col <- colors()
+head(luv_colours)
## L u v col
## 1 9341.570 -3.370649e-12 0.0000 white
## 2 9100.962 -4.749170e+02 -635.3502 aliceblue
@@ -692,14 +698,14 @@ 6.2.2.1 hclust() and dist() funct
## 5 8452.499 1.014911e+03 1609.5923 antiquewhite2
## 6 7498.378 9.029892e+02 1401.7026 antiquewhite3
This visualization represents the content of the dataset; next we will see how it looks in a graph representation.
-ggplot(luv_colours, aes(u, v)) +
-geom_point(aes(colour = col), size = 3) +
-scale_color_identity() +
-coord_equal() +
- theme_void()
+ggplot(luv_colours, aes(u, v)) +
+geom_point(aes(colour = col), size = 3) +
+scale_color_identity() +
+coord_equal() +
+ theme_void()
For example, selecting the first 3 variables and plotting the data with the plot() function we can see that there are some connections within the elements of the dataset, as the colors are connected to each other.
-
+
## L u v
## 1 9341.570 -3.370649e-12 0.0000
## 2 9100.962 -4.749170e+02 -635.3502
@@ -707,14 +713,14 @@ 6.2.2.1 hclust() and dist() funct
## 4 8935.225 1.065698e+03 1674.5948
## 5 8452.499 1.014911e+03 1609.5923
## 6 7498.378 9.029892e+02 1401.7026
-
+
-
-
+
+
## [1] "hclust"
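The code that produced the "hclust" object is not visible in this excerpt. As a minimal sketch of the usual dist() and hclust() pipeline, on a tiny made-up data set instead of luv_colours:

```r
# Hypothetical points forming two well-separated pairs.
pts <- data.frame(
  u = c(0, 1, 10, 11),
  v = c(0, 1, 10, 11)
)

# dist() computes pairwise (Euclidean by default) distances between rows.
d <- dist(pts)

# hclust() builds a hierarchical clustering tree from those distances.
tree <- hclust(d)
print(class(tree))

# Cutting the tree into two groups recovers the two pairs.
groups <- cutree(tree, k = 2)
print(groups)
```

An object of this class is what as_tbl_graph() can then convert into a tree-shaped graph.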
With the tidygraph::as_tbl_graph()
function we can transform the dataset into an object of classes “tbl_graph” and “igraph”, ready to use for visualizing the network data.
-
+
## # A tbl_graph: 1313 nodes and 1312 edges
## #
## # A rooted tree
@@ -747,10 +753,10 @@ 6.2.2.1 hclust() and dist() funct
6.2.3 Algorithms
The real benefit of networks comes from the different operations that can be performed on them using the underlying structure.
-luv_graph %>%
- tidygraph::activate(nodes) %>%
- mutate(centrality = centrality_pagerank()) %>%
- arrange(desc(centrality))
+luv_graph %>%
+ tidygraph::activate(nodes) %>%
+ mutate(centrality = centrality_pagerank()) %>%
+ arrange(desc(centrality))
## # A tbl_graph: 1313 nodes and 1312 edges
## #
## # A rooted tree
diff --git a/working-with-sf-data.html b/working-with-sf-data.html
index 06f43bfa..4f534b21 100644
--- a/working-with-sf-data.html
+++ b/working-with-sf-data.html
@@ -23,7 +23,7 @@
-
+
@@ -188,18 +188,24 @@
- Layers
- 2 Individual Geoms
-- 2.1 Scatterplot:
-- 2.2 Line plot:
-- 2.3 Histogram:
-- 2.4 Bar chart
-- 2.5 geom_polygon() draws polygons which are filled paths.
-- 2.6 geom_line() connects points from left to right.
-- 2.7 What low-level geoms are used to draw geom_smooth()?
-- 2.8 What low-level geoms are used to draw geom_boxplot()?
-- 2.9 What low-level geoms are used to draw geom_violin()?
-- 2.10 Meeting Videos
-
-- 2.10.1 Cohort 1
+- 2.1 The basics
+- 2.2 Area chart:
geom_area()
+- 2.3 Bar chart:
geom_bar()
+- 2.4 Line chart:
geom_line()
+- 2.5 Scatterplot:
geom_point()
+- 2.6 Polygons:
geom_polygon()
+- 2.7 Histograms:
geom_histogram()
+- 2.8 Drawing rectangles:
geom_rect()
; geom_tile()
; geom_raster()
+- 2.9 Add text to a plot:
geom_text()
+- 2.10 Exercise solutions
+
+- 2.11 Meeting Videos
+
- 3 Collective Geoms