forked from cis-ds/course-site
-
Notifications
You must be signed in to change notification settings - Fork 0
/
hw00_homework_guidelines.Rmd
95 lines (62 loc) · 7.04 KB
/
hw00_homework_guidelines.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
---
title: General Homework Guidelines
output:
html_document:
toc: true
toc_float: true
---
# GitHub prerequisites
I assume you can [pull from and push to GitHub from RStudio](git07.html).
# Homework workflow
![](images/homework_workflow.png)
Homework assignments will be stored in separate Git repositories under the `uc-cfss` organization on GitHub. To complete a homework assignment, you need to:
1. [Fork](https://guides.github.com/activities/forking/) the repository
1. [Clone](git05.html#step_2:_clone_the_new_github_repository_to_your_computer_via_rstudio) the repository to your computer
1. Modify the files and [commit changes](git05.html#step_3:_make_local_changes,_save,_commit) to complete your solution.
1. [Push](git05.html#step_4:_push_your_local_changes_online_to_github)/sync the changes up to GitHub.
1. [Create a pull request](https://help.github.com/articles/creating-a-pull-request) on the original repository to turn in the assignment. *Make sure to include your name in the pull request.*
# Authoring Markdown files
Throughout this course, any basic text document should be written in [Markdown](http://daringfireball.net/projects/markdown/basics) and should always have a filename that ends in `.md`. These files are pleasant to write and read, but are also easily converted into HTML and other output formats. GitHub provides an attractive HTML-like preview for Markdown documents. RStudio's "Preview HTML" button will compile the open document to actual HTML and open a preview.
Whenever you are editing Markdown documents in RStudio, you can display a Markdown cheatsheet by going to Help > Markdown Quick Reference.
# Authoring R Markdown files
If your document is describing a data analysis, author it in [R Markdown](http://rmarkdown.rstudio.com), which is like Markdown, but with the addition of R "code chunks" that are runnable. The filename should end in `.Rmd` or `.rmd`. RStudio's "Knit HTML" button will compile the open document to actual HTML and open a preview.
Whenever you are editing R Markdown documents in RStudio, you can display an R Markdown cheatsheet by going to Help > Cheatsheets > R Markdown Cheat Sheet. A basic introduction to R Markdown can also be found in [R for Data Science](http://r4ds.had.co.nz/r-markdown.html)
# Which files to commit
* Always commit the main source document, e.g., the R script or R Markdown or Markdown document. Commit early, commit often!
* For R Markdown source, also commit the intermediate Markdown (`.md`) file and any accompaying files, such as figures.
* Some purists would say intermediate and downstream products do NOT belong in the repo. After all, you can always recreate them from source, right? But here in reality, it turns out to be incredibly handy to have this in the repo.
* Commit the end product HTML (`.html`) file.
* See above comment re: version control purists vs. pragmatists.
* You may not want to commit the Markdown and HTML until the work is fairly advanced, maybe even until submission. Once these enter the repo, you really should recompile them each time you commit changes to the R Markdown source, so that the Git history reflects the way these files should evolve as an ensemble.
* **Never ever** edit the intermediate/output documents "by hand". Only edit the source and then regenerate the downstream products from that.
# Make your work shine!
Here are some minor tweaks that can make a big difference in how awesome your product is.
## Make it easy for people to access your work
Reduce the friction for graders to get the hard-working source code (the R markdown) **and** the front-facing report (HTML).
* Create a `README.md` in the homework's main directory to serve as the landing page for your submission. Whenever anyone visits this repo, this will be automatically rendered nicely! In particular, hyperlinks will work.
* With this `README.md` file, create annotated links to the documents graders will need to access. Such as:
* Your main R Markdown document.
* The intermediate Markdown product that comes from knitting your main R Markdown document. Remember GitHub will render this into pseudo-HTML automagically. Remember the figures in `figures/` need to be available in the repo in order appear here.
* The final pretty HTML report. Read instructions below on how access the pretty, not the ugly source.
### Linking to HTML files in the repo
Simply visiting an HTML file in a GitHub repo just shows ugly HTML source. You need to do a little extra work to see this rendered as a proper webpage.
* Navigate to the HTML file on GitHub. Click on "Raw" to get the raw version; the URL should look something like this: `https://raw.github.com/uc-cfss/uc-cfss.github.io/hw00_homework_guidelines.html`. Copy that URL!
* Create a link to that in the usual Markdown way BUT prepend `http://htmlpreview.github.io/?` to the URL. So the URL in your link should look something like this: `http://htmlpreview.github.io/?https://github.com/uc-cfss/uc-cfss.github.io/blob/master/hw00_homework_guidelines.html`. You can learn more about this preview facility [here](http://htmlpreview.github.io).
* This sort of link would be fabulous to include in `README.md`.
## Make it easy for others to run your code
* In exactly one, very early R chunk, load any necessary packages, so your dependencies are obvious.
* In exactly one, very early R chunk, import anything coming from an external file. This will make it easy for someone to see which data files are required, edit to reflect their locals paths if necessary, etc. There are situations where you might not keep data in the repo itself.
* In exactly one, very last R chunk, report your session information. This prints version information about R, the operating system, and loaded packages so the reader knows the state of your machine when you rendered the R Markdown document.^[Unfortunately I know of no easy way to generate similar information using Python.] An R chunk with `devtools::session_info()` will produce something that looks like this:
```{r demo_session_info, echo = FALSE}
devtools::session_info()
```
* Pretend you are someone else. Clone a fresh copy of your own repo from GitHub, fire up a new RStudio session and try to knit your R markdown file. Does it "just work"? It should!
## Make pretty tables
Instead of just printing an object with R, you could format the info in an attractive table. Some leads:
* The `kable()` function from `knitr`.
* Also look into the packages `xtable`, `pander`.
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Not sure what <a href="https://twitter.com/hashtag/rstats?src=hash">#rstats</a> package to use to make your lovely Reproducible Table? I made a flowchart for you! <a href="https://twitter.com/hashtag/knitr?src=hash">#knitr</a> <a href="http://t.co/5qC5EBADN4">pic.twitter.com/5qC5EBADN4</a></p>— Andrew MacDonald (@polesasunder) <a href="https://twitter.com/polesasunder/status/464132152347475968">May 7, 2014</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
### Acknowledgments {.toc-ignore}
```{r child='_ack_stat545.Rmd'}
```