---
title: "Quarto"
abstract: "This section introduces the literate statistical programming tool Quarto."
format:
  html:
    code-line-numbers: true
    fig-align: "center"
editor_options:
  chunk_output_type: console
bibliography: references.bib
---
![A page from Alexander Graham Bell's laboratory notebook](images/AGBell_Notebook.jpg)
```{r hidden-here-load}
#| include: false
exercise_number <- 1
```
```{r}
#| echo: false
#| warning: false
library(tidyverse)
library(gt)
source("src/motivation.R")
```
```{r}
#| label: tbl-roadmap
#| tbl-cap: "Opinionated Analysis Development"
#| echo: false
motivation |>
  filter(!is.na(Section), Section == "Literate Programming") |>
  select(-`Analysis Feature`) |>
  arrange(Section) |>
  gt() |>
  tab_header(
    title = "Opinionated Analysis Development"
  ) |>
  tab_footnote(
    footnote = "Added by Aaron R. Williams",
    locations = cells_column_labels(columns = c(Tool, Section))
  ) |>
  tab_source_note(
    source_note = md("**Source:** Parker, Hilary. n.d. “Opinionated Analysis Development.” https://doi.org/10.7287/peerj.preprints.3210v1.")
  )
```
## Literate (Statistical) Programming
![**Source:** [Jacob Appelbaum](https://en.wikipedia.org/wiki/Donald_Knuth#/media/File:KnuthAtOpenContentAlliance.jpg)](images/knuth.jpg){width="150"}
Donald Knuth, the creator of TeX, is a founder of literate programming. Here is his motivation in his own words:
> Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do. \~ [@knuth1984]
### Example
We used a linear model because there is reason to believe that the population model is linear. The observations are independent and the errors are independently and identically distributed with an approximately normal distribution.
```{r}
#| label: linear-model
model1 <- lm(formula = dist ~ speed, data = cars)
model1
```
An increase in travel speed of one mile per hour is associated with a `r round(model1$coefficients[2], 2)` foot increase in stopping distance on average.
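The inline expression above pulls the slope directly from the fitted model, so the sentence updates whenever the data or the model change. Here is a minimal sketch of the same calculation outside of the text. It uses `coef()` to extract the slope by name, which is an equivalent accessor chosen here for illustration:

```r
# Fit the same model as the chunk above.
model1 <- lm(formula = dist ~ speed, data = cars)

# Extract the slope on `speed` by name and round it for reporting.
slope <- round(coef(model1)[["speed"]], 2)
slope
```

Extracting the coefficient by name rather than by position keeps the sentence correct if predictors are later added to the model.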
## Quarto
### Why Quarto
Quarto is a literate statistical programming tool for R, Julia, Python, JavaScript, and more that was released by [Posit](https://posit.co) in 2022. Quarto is an important tool for reproducible research. It combines narrative text with styles, code, and the output of code and can be used to create many types of documents including PDFs, HTML websites, slides, and more.
Quarto builds on the success of R Markdown. In fact, Quarto will render R Markdown (`.Rmd`) documents without any edits or changes.
Jupyter (**Ju**lia, **Py**thon, and **R**) is a competing framework that is popular for Python but has not caught on for R.
According to @wickham2016 [Chapter 27](https://r4ds.had.co.nz/r-markdown.html#introduction-18), there are three main reasons to use R Markdown (they also hold for Quarto):
> 1. "For communicating to decision makers, who want to focus on the conclusions, not the code behind the analysis."
> 2. "For collaborating with other data scientists (including future you!), who are interested in both your conclusions, and how you reached them (i.e. the code)."
> 3. "As an environment in which to do data science, as a modern day lab notebook where you can capture not only what you did, but also what you were thinking."
### Quarto Process
Quarto uses
- plain text files ending in `.qmd` that are similar to `.R` files.
- `library(knitr)`.
- [pandoc](https://pandoc.org/).[^11_quarto-1]
[^11_quarto-1]: Pandoc is free software that converts documents between markup formats. For example, Pandoc can convert files to and from markdown, LaTeX, jupyter notebook (.ipynb), and Microsoft Word (.docx) formats, among many others. You can see a comprehensive list of files Pandoc can convert on their [About Page](https://pandoc.org/index.html).
`.qmd` files can be run interactively, chunk by chunk, or rendered as a whole document. Clicking the "Render" button starts the rendering process.
![](images/render.png)
1. Quarto opens a fresh R session and runs all code from scratch. Quarto calls `library(knitr)` and "knits" `.qmd` (Quarto files) into `.md` (Markdown files).
2. Pandoc converts the `.md` file into any specified output type.
Quarto and `library(knitr)` don't need to be explicitly loaded and most of the magic happens "under the hood."
![](images/rstudio-qmd-how-it-works.png)
**Source:** [Quarto website](https://quarto.org/docs/get-started/hello/rstudio.html)
Quarto, `library(knitr)`, and Pandoc are all installed with RStudio.[^11_quarto-2]
[^11_quarto-2]: Rendering to PDF requires a LaTeX distribution. Follow [these instructions](https://yihui.org/tinytex/) to install `library(tinytex)` if you want to make PDF documents.
The "Render" workflow has a few advantages:
1. All code is rerun in a clean environment when "Rendering". This ensures that the code runs in order and is reproducible.
2. It is easier to document code than with inline comments.
3. The output types are really appealing. By creating publishable documents with code, there is no need to copy-and-paste or transpose results.
4. The process is iterable and scalable.
::: callout
#### [`r paste("Exercise", exercise_number)`]{style="color:#1696d2;"}
```{r}
#| echo: false
exercise_number <- exercise_number + 1
```
1. Click the new script button and add a "Quarto Document".
2. Give the document a name, an author, and ensure that HTML is selected.
3. Save the document as "analysis.qmd" in your `example-project/` project.
4. Click "Render".
:::
## Three Ingredients in a `.qmd`
1. YAML header
2. Markdown text
3. Code chunks
### 1. YAML header
YAML originally stood for "yet another markup language" and is now a recursive acronym for "YAML Ain't Markup Language." The YAML header contains meta information about the document including output type, document settings, and parameters that can be passed to the document. The YAML header starts with `---` and ends with `---`.
Here is the simplest YAML header for an HTML document:
```
---
format: html
---
```
YAML headers can contain many output specific settings. This YAML header creates an HTML document with code folding and a floating table of contents:
```
---
format:
  html:
    code-fold: true
    toc: true
---
```
::: callout
#### [`r paste("Exercise", exercise_number)`]{style="color:#1696d2;"}
```{r}
#| echo: false
exercise_number <- exercise_number + 1
```
1. Add `embed-resources: true` to your YAML header.
2. Update the `html` line to look like:
```
---
embed-resources: true
format:
  html:
    code-fold: true
---
```
:::
### 2. Markdown text
Markdown is a shortcut for HyperText Markup Language (HTML). Essentially, simple meta characters corresponding to formatting are added to plain text.
```
Titles and subtitles
------------------------------------------------------------
# Title 1
## Title 2
### Title 3
Text formatting
------------------------------------------------------------
*italic*
**bold**
`code`
Lists
------------------------------------------------------------
- Bulleted list item 1
- Item 2
    - Item 2a
    - Item 2b
1. Item 1
2. Item 2
Links and images
------------------------------------------------------------
[text](http://link.com)
![Image caption](images/image.png)
```
::: callout
#### [`r paste("Exercise", exercise_number)`]{style="color:#1696d2;"}
```{r}
#| echo: false
exercise_number <- exercise_number + 1
```
1. Add text with formatting like headers and bold to your Quarto document.
2. Render!
:::
### 3. Code chunks
![](images/inline-r-code.png)
Code can be run inline within text, as in the image above. More frequently, code is added in code chunks:
```{r}
#| echo: fenced
2 + 2
```
The first argument inline or in a code chunk is the language engine. Most commonly, this will just be a lower case `r`. `knitr` allows for many different language engines:
- R
- Julia
- Python
- SQL
- Bash
- Rcpp
- Stan
- JavaScript
- CSS
Quarto has a rich set of options that go inside of the chunks and control the behavior of Quarto.
```{r}
#| echo: fenced
#| eval: false
2 + 2
```
In this case, `eval: false` keeps the code from running. Other chunk-specific options can be added on additional `#|` lines at the top of the chunk. Here[^11_quarto-3] are the most important options:
[^11_quarto-3]: This table was typed as Markdown code. But sometimes it is easier to use a code chunk to create and print a table. Pipe any data frame into `knitr::kable()` to create a table that will be formatted in the output of a rendered Quarto document.
| Option | Effect |
|------------------|-----------------------------------------------|
| `echo: false` | Hides code in output |
| `eval: false` | Turns off evaluation |
| `output: false` | Hides code output |
| `warning: false` | Turns off warnings |
| `message: false` | Turns off messages |
| `fig-height: 8`  | Changes figure height in inches[^11_quarto]   |
| `fig-width: 8`   | Changes figure width in inches[^11_quarto]    |
[^11_quarto]: The default dimensions for figures change based on the output format. Visit [here](https://quarto.org/docs/computations/execution-options.html#figure-options) to learn more.
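
Several of these options can be combined in one chunk. The sketch below shows the contents of an `{r}` chunk that hides its code and messages and prints a table with `knitr::kable()` instead of hand-typed Markdown. The column labels are illustrative choices and are not part of the built-in `cars` data:

```r
#| echo: false
#| message: false

# First six rows of the built-in `cars` data, formatted as a table.
cars_table <- head(cars) |>
  knitr::kable(col.names = c("Speed (mph)", "Stopping distance (ft)"))

cars_table
```

Because `echo: false` is set, only the formatted table appears in the rendered document.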
::: callout
#### [`r paste("Exercise", exercise_number)`]{style="color:#1696d2;"}
```{r}
#| echo: false
exercise_number <- exercise_number + 1
```
1. Add a code chunk.
2. Load `library(tidyverse)`.
3. Filter the built-in data set `storms` to only include hurricanes.[^r-code]
4. Make a data visualization with ggplot2 using the data from `storms`.
5. Include an option to hide the R code.
6. Render!
:::
[^r-code]: `library(tidyverse)` isn't the focus of today. Feel free to write 2 + 2.
## Organizing a Quarto Document
It is important to clearly organize a Quarto document and the constellation of files that typically support an analysis.
1. Always use `.Rproj` files.
2. Use sub-directories to sort images, `.css`, data.
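
Here is a minimal sketch of one possible layout. The folder names are hypothetical, and a temporary directory stands in for the project root so the example does not touch any real project:

```r
# Hypothetical project root; a temporary directory keeps the sketch self-contained.
project_root <- file.path(tempdir(), "example-project")

# Hypothetical sub-directories for data, images, stylesheets, and helper scripts.
project_dirs <- c("data", "images", "css", "src")

# Create each sub-directory (and the root) if it does not already exist.
for (dir_name in project_dirs) {
  dir.create(file.path(project_root, dir_name), recursive = TRUE, showWarnings = FALSE)
}

# List the resulting folders.
list.dirs(project_root, full.names = FALSE, recursive = FALSE)
```

In a real project, the `.Rproj` file sits in the project root, and paths inside the `.qmd` are written relative to that root.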
## Outputs
The [Quarto gallery](https://quarto.org/docs/gallery/) contains many example outputs.
### HTML Notebooks and Reports
::: callout
.html Quarto is an ideal format for notebooks and reports. It has flexible functionality and it avoids page breaks, which are a momentum kill.
:::
```
---
format: html
---
```
- [A Quarto Page Layout Example](https://quarto-dev.github.io/quarto-gallery/page-layout/tufte.html)
### PDF Notebooks, Reports, and Journal Articles
::: callout
.pdf Quarto is the ideal format for printing. Several major academic journals provide [templates](https://github.com/quarto-journals/) for publishing directly from Quarto.
:::
```
---
format: pdf
---
```
- [This blog](https://cameronpatrick.com/post/2023/07/quarto-thesis-formatting/) details how Cameron Patrick wrote their PhD thesis in Quarto.
### MS Word Reports
::: callout
.docx Quarto works but lacks functionality for precise formatting. [This page](https://quarto.org/docs/reference/formats/docx.html) outlines the formatting options.
:::
```
---
format: docx
---
```
### Presentations
::: callout
Quarto can render awesome slide decks using RevealJS, Beamer, and PowerPoint.
:::
```
---
format:
  revealjs:
    css: styles.css
    incremental: true
    slide-number: true
    preview-links: true
---
```
- [Pop Songs and Political Science](https://quarto-dev.github.io/quarto-gallery/presentations/beamer/beamer.pdf)
- [R + Quarto: How we developed a pipeline to create \>3500 html factsheets](https://github.com/Deckart2/mm-presentation)
### Websites
::: callout
Quarto is excellent for building websites from multiple .qmd documents. It is easy to host these websites for free using GitHub Pages.
:::
- [R at the Urban Institute website](https://urbaninstitute.github.io/r-at-urban/)
- [Reproducibility at Urban](https://ui-research.github.io/reproducibility-at-urban/)
### Books
::: callout-block
[Bookdown](https://bookdown.org/) is an R package by Yihui Xie for authoring books in R Markdown. Many books, including the first edition of *R for Data Science* [@wickham2016], have been written with bookdown.
[Quarto book](https://quarto.org/docs/books/) replaces bookdown. It is oriented around Quarto projects. The second edition of *R for Data Science* [@wickham2023] was written in Quarto.
:::
- [Data Science for Public Policy](https://awunderground.github.io/data-science-for-public-policy2/)
### GitHub README
::: callout
Quarto can render .md documents that work well as READMEs for GitHub and GitLab. This is useful for including code and code output in the README.
:::
```
---
format: gfm
---
```
- [urbnthemes](https://github.com/UrbanInstitute/urbnthemes)
- [syntheval](https://github.com/UrbanInstitute/syntheval)
### Fact Sheets and Fact Pages
An alternative to rendering a Quarto document with the Render button is to use the `quarto::quarto_render()` function. This allows for iterating the rendering of documents, which is particularly useful for the development of fact sheets and fact pages. The next chapter of the book expands on this use case.
- The Urban Institute State and Local Finance Initiative creates [State Fiscal Briefs](https://www.urban.org/policy-centers/cross-center-initiatives/state-and-local-finance-initiative/projects/state-fiscal-briefs) by iterating R Markdown documents.
- [Data\@Urban](https://medium.com/@urban_institute/iterated-fact-sheets-with-r-markdown-d685eb4eafce)
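
Here is a rough sketch of iterated rendering with `quarto_render()` from the quarto R package. The template name `fact-sheet.qmd`, the `state` parameter, and the list of states are all hypothetical, and the template would need to declare `state` under `params:` in its YAML header:

```r
# Hypothetical parameterized template and the values to iterate over.
template <- "fact-sheet.qmd"
states <- c("Alabama", "Alaska", "Arizona")

# Build one output file name per state, e.g. "fact-sheet-alabama.html".
output_files <- paste0("fact-sheet-", tolower(states), ".html")

# Render only if the quarto package is installed and the template exists.
if (requireNamespace("quarto", quietly = TRUE) && file.exists(template)) {
  for (i in seq_along(states)) {
    quarto::quarto_render(
      input = template,
      output_file = output_files[i],
      execute_params = list(state = states[i])
    )
  }
}
```

Each pass through the loop produces a separate document from the same template, with only the parameter value changing.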
::: callout
#### [`r paste("Exercise", exercise_number)`]{style="color:#1696d2;"}
```{r}
#| echo: false
exercise_number <- exercise_number + 1
```
Rendering a Quarto document to PDF requires a LaTeX distribution.
1. If you already have a LaTeX distribution like `tinytex` or `MiKTeX`, then skip this exercise.
2. Follow [these instructions](https://yihui.org/tinytex/) to install `library(tinytex)`.
:::
::: callout
#### [`r paste("Exercise", exercise_number)`]{style="color:#1696d2;"}
```{r}
#| echo: false
exercise_number <- exercise_number + 1
```
1. Start a new .qmd document. During the prompt, select PDF output instead of HTML output.
2. Create a basic .qmd document called `analysis.qmd`.
3. Render the PDF.
:::
## Conclusion
Quarto is an updated version of R Markdown that can handle not only R but also Python and Julia code (among other languages). Quarto combines a YAML header, Markdown text, and code chunks. It can be used in a variety of settings to create technical documents and presentations. We love Quarto and hope you will learn to love it too!
### Suggestions
- Render early, and render often.
- Select the gear to the right of "Render" and select "Chunk Output in Console".
- Learn math mode. Also, `library(equatiomatic)` ([CRAN](https://cran.r-project.org/web/packages/equatiomatic/index.html), [GitHub](https://github.com/datalorax/equatiomatic)) is amazing.
### Resources
- [Quarto intro](https://quarto.org/)
- [R4DS Quarto chapter](https://r4ds.hadley.nz/quarto)
- [Happy Git R Markdown tutorial](https://happygitwithr.com/rmd-test-drive.html)