<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<title>Data transformation and exploratory data analysis</title>
<script src="site_libs/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="site_libs/bootstrap-3.3.5/css/readable.min.css" rel="stylesheet" />
<script src="site_libs/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/respond.min.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-45631879-2', 'auto');
ga('send', 'pageview');
</script>
<style type="text/css">
h1 {
font-size: 34px;
}
h1.title {
font-size: 38px;
}
h2 {
font-size: 30px;
}
h3 {
font-size: 24px;
}
h4 {
font-size: 18px;
}
h5 {
font-size: 16px;
}
h6 {
font-size: 12px;
}
.table th:not([align]) {
text-align: left;
}
</style>
</head>
<body>
<style type = "text/css">
.main-container {
max-width: 940px;
margin-left: auto;
margin-right: auto;
}
code {
color: inherit;
background-color: rgba(0, 0, 0, 0.04);
}
img {
max-width:100%;
height: auto;
}
.tabbed-pane {
padding-top: 12px;
}
button.code-folding-btn:focus {
outline: none;
}
</style>
<style type="text/css">
/* padding for bootstrap navbar */
body {
padding-top: 66px;
padding-bottom: 40px;
}
/* offset scroll position for anchor links (for fixed navbar) */
.section h1 {
padding-top: 71px;
margin-top: -71px;
}
.section h2 {
padding-top: 71px;
margin-top: -71px;
}
.section h3 {
padding-top: 71px;
margin-top: -71px;
}
.section h4 {
padding-top: 71px;
margin-top: -71px;
}
.section h5 {
padding-top: 71px;
margin-top: -71px;
}
.section h6 {
padding-top: 71px;
margin-top: -71px;
}
</style>
<script>
// manage active state of menu based on current page
$(document).ready(function () {
// active menu anchor
var href = window.location.pathname;
href = href.substr(href.lastIndexOf('/') + 1);
if (href === "")
href = "index.html";
var menuAnchor = $('a[href="' + href + '"]');
// mark it active
menuAnchor.parent().addClass('active');
// if it's got a parent navbar menu mark it active as well
menuAnchor.closest('li.dropdown').addClass('active');
});
</script>
<div class="container-fluid main-container">
<!-- tabsets -->
<script src="site_libs/navigation-1.1/tabsets.js"></script>
<script>
$(document).ready(function () {
window.buildTabsets("TOC");
});
</script>
<!-- code folding -->
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="index.html">Computing for the Social Sciences</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="index.html">Home</a>
</li>
<li>
<a href="faq.html">FAQ</a>
</li>
<li>
<a href="syllabus.html">Syllabus</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
<div class="fluid-row" id="header">
<h1 class="title toc-ignore">Data transformation and exploratory data analysis</h1>
</div>
<div id="cm003---october-3-2016" class="section level1">
<h1>cm003 - October 3, 2016</h1>
<div id="overview" class="section level2">
<h2>Overview</h2>
<ul>
<li>Identify computer programming as a form of problem solving</li>
<li>Practice decomposing an analytical goal into a set of discrete, computational tasks</li>
<li>Identify the verbs for a language of data manipulation</li>
<li>Clarify confusing aspects of data transformation from <a href="http://r4ds.had.co.nz/transform.html">R for Data Science</a></li>
<li>Define <em>exploratory data analysis</em> (EDA) and types of pattern exploration</li>
<li>Demonstrate types of graphs useful for EDA and precautions when interpreting them</li>
<li>Practice transforming and exploring data using Department of Education College Scorecard data</li>
</ul>
</div>
<div id="slides-and-links" class="section level2">
<h2>Slides and links</h2>
<ul>
<li><a href="extras/cm003_slides.html">Slides</a></li>
<li><a href="https://gist.github.com/bensoltoff/ffd6582e5fbd9f345f72034bbfa31be5">cm003_scorecard_practice.R</a> - in-class practice activity</li>
<li><a href="extras/cm003_scorecard_tutorial.html">Solution set for <code>scorecard</code> activity</a></li>
</ul>
<div id="cheat-sheets" class="section level3">
<h3>Cheat sheets</h3>
<ul>
<li><a href="https://www.rstudio.com/wp-content/uploads/2015/12/ggplot2-cheatsheet-2.0.pdf">Data Visualization with <code>ggplot2</code> Cheat Sheet</a></li>
<li><a href="https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf">Data Wrangling with <code>dplyr</code> and <code>tidyr</code> Cheat Sheet</a></li>
</ul>
</div>
</div>
<div id="to-do-for-wednesday" class="section level2">
<h2>To do for Wednesday</h2>
<ul>
<li><a href="https://docs.google.com/forms/d/e/1FAIpQLSdBjiacwzO8WexQNJRW-dBuzc4lzj-AkAdm1FUqe5O2_kkbWg/viewform">Register your GitHub account for the class</a> - all remaining homework assignments will be in <em>private repositories</em>. Private repos can only be seen and edited by members of our <a href="https://github.com/uc-cfss">course organization</a>. Once you register your GitHub account, I will invite you to join the course organization. If you don’t register your account, you won’t have access to any of the homework assignments.</li>
<li><a href="hw01_edit-README.html">Submit homework 1</a></li>
<li>Read chapters 9-13 of <a href="http://r4ds.had.co.nz/">R for Data Science</a></li>
<li>Lohr. 2014. <a href="http://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html?_r=0">For Big-Data Scientists, “Janitor Work” Is Key Hurdle to Insights.</a> <em>New York Times</em>.</li>
<li>Start thinking about your topic for your <a href="project_description.html">final project</a>. If you want to do a group project, start identifying project partners or post on the <a href="https://github.com/uc-cfss/Discussion">discussion board</a> to find classmates with similar interests.</li>
</ul>
</div>
</div>
<p>This work is licensed under the <a href="http://creativecommons.org/licenses/by-nc/4.0/">CC BY-NC 4.0 Creative Commons License</a>.</p>
</div>
<script>
// add bootstrap table styles to pandoc tables
$(document).ready(function () {
$('tr.header').parent('thead').parent('table').addClass('table table-condensed');
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>