---
title: "Data transformation and exploratory data analysis"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(cache=TRUE)
```
# cm003 - October 3, 2016
## Overview
* Identify computer programming as a form of problem solving
* Practice decomposing an analytical goal into a set of discrete, computational tasks
* Identify the verbs for a language of data manipulation
* Clarify confusing aspects of data transformation from [R for Data Science](http://r4ds.had.co.nz/transform.html)
* Define *exploratory data analysis* (EDA) and types of pattern exploration
* Demonstrate types of graphs useful for EDA and precautions when interpreting them
* Practice transforming and exploring data using Department of Education College Scorecard data
## Slides and links
* [Slides](extras/cm003_slides.html)
* [cm003_scorecard_practice.R](https://gist.github.com/bensoltoff/ffd6582e5fbd9f345f72034bbfa31be5) - in-class practice activity
* [Solution set for `scorecard` activity](extras/cm003_scorecard_tutorial.html)
### Cheat sheets
* [Data Visualization with `ggplot2` Cheat Sheet](https://www.rstudio.com/wp-content/uploads/2015/12/ggplot2-cheatsheet-2.0.pdf)
* [Data Wrangling with `dplyr` and `tidyr` Cheat Sheet](https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf)
## To do for Wednesday
* [Register your GitHub account for the class](https://docs.google.com/forms/d/e/1FAIpQLSdBjiacwzO8WexQNJRW-dBuzc4lzj-AkAdm1FUqe5O2_kkbWg/viewform). All remaining homework assignments will be in *private repositories*, which can only be seen and edited by members of our [course organization](https://github.com/uc-cfss). Once you register your GitHub account, I will invite you to join the course organization. If you don't register your account, you won't have access to any of the homework assignments.
* [Submit homework 1](hw01_edit-README.html)
* Read chapters 9-13 from [R for Data Science](http://r4ds.had.co.nz/)
* Lohr. 2014. [For Big-Data Scientists, "Janitor Work" Is Key Hurdle to Insights.](http://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html?_r=0) *New York Times*.
* Start thinking about your topic for your [final project](project_description.html). If you want to do a group project, start identifying project partners or post on the [discussion board](https://github.com/uc-cfss/Discussion) to find classmates with similar interests.