-
Notifications
You must be signed in to change notification settings - Fork 46
/
Copy pathchapter9.Rmd
755 lines (592 loc) · 17.1 KB
/
chapter9.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
---
title: Programming Basics
description: Programming Basics
---
## Conditionals
```yaml
type: MultipleChoiceExercise
key: 77691772c6
lang: r
xp: 50
skills:
- 1
```
What will this conditional expression return?
Run it from the console.
```{r}
x <- c(1,2,-3,4)
if(all(x>0)){
print("All Positives")
} else{
print("Not All Positives")
}
```
`@possible_answers`
- All Positives
- Not All Positives
- N/A
- None of the above
`@hint`
Are the numbers stored in `x` all positive? If not the `print` after the `else` is evaluated.
`@pre_exercise_code`
```{r}
# no pec
```
`@sct`
```{r}
msg1 = "Are all the entries in `x` positive?"
msg2 = "Awesome! That is correct."
msg3 = "In R you can vectorize logical relations like this. There is no reason to have `NA`s"
msg4 = "If `all(x>0)` is `TRUE` or `FALSE` something is returned."
test_mc(correct = 2, feedback_msgs = c(msg1,msg2,msg3,msg4))
```
---
## Conditional continued
```yaml
type: MultipleChoiceExercise
key: 78dfb810dd
lang: r
xp: 50
skills:
- 1
```
Which of the following expressions is always `FALSE` when at least one entry of a logical vector x is `TRUE`? You can try examples in the R console.
`@possible_answers`
- all(x)
- any(x)
- any(!x)
- all(!x)
`@hint`
Define a logical vector `x` such as `c(TRUE, FALSE)` and `c(TRUE, TRUE)` and test them.
`@pre_exercise_code`
```{r}
# no pec
```
`@sct`
```{r}
msg1 = "Try `x <- c(TRUE, FALSE)` or `x <- c(TRUE, TRUE)`."
msg2 = "Try `x <- c(TRUE, TRUE)`"
msg3 = "Try `x <- c(TRUE, FALSE)"
msg4 = "Good job! Let's move to the next question."
test_mc(correct = 4, feedback_msgs = c(msg1,msg2,msg3,msg4))
```
---
## ifelse
```yaml
type: NormalExercise
key: 04735d26b3
lang: r
xp: 100
skills:
- 1
```
The function `nchar` tells you how many characters long a character vector is. For example:
```{r}
char_len <- nchar(murders$state)
head(char_len)
```
The function `ifelse` is useful because you convert a vector of logicals into something else. For example, some datasets use the number -999 to denote NA. A bad practice! You can convert the -999 in a vector to NA using the following `ifelse` call:
```{r}
x <- c(2, 3, -999, 1, 4, 5, -999, 3, 2, 9)
ifelse(x == -999, NA, x)
```
If the entry is -999 it returns NA, otherwise it returns the entry.
`@instructions`
We will combine a number of functions for this exercise.
- Use the `ifelse` function to write one line of code that assigns to the object `new_names` the state abbreviation when the state name is longer than 8 characters and assigns the state name when the name is not longer than 8 characters.
For example, where the original vector has Massachusetts (13 characters), the new vector should have MA. But where the original vector has New York (8 characters), the new vector should have New York as well.
`@hint`
Use a call to `nchar` as the first argument of the `ifelse` function to calculate. Check to see if it's larger than 8. Then you use `murders$abb` and `murders$state` in the second and third arguments to return either the abbreviation or state name depending on the number of characters.
`@pre_exercise_code`
```{r}
library(dslabs)
data(murders)
```
`@sample_code`
```{r}
# Assign the state abbreviation when the state name is longer than 8 characters
```
`@solution`
```{r}
# Assign the state abbreviation when the state name is longer than 8 characters
new_names <- ifelse(nchar(murders$state)>8, murders$abb, murders$state)
```
`@sct`
```{r}
test_error()
test_function("ifelse")
test_object("new_names", undefined_msg = "Make sure you define new_names!", incorrect_msg = "It has to be one line of code. Combine nchar and ifelse. The objects being used inside the function should be `murders$state` and `murders$abb`. Also we are looking for `nchar(murders$state)` strictly larger than 8.")
success_msg("Woohoo! You're becoming a pro at this!")
```
---
## Defining functions
```yaml
type: NormalExercise
key: 3d324d0749
lang: r
xp: 100
skills:
- 1
```
You will encounter situations in which the function you need does not already exist. R permits you to write your own. Let's practice one such situation, in which you first need to define the function to be used. The functions you define can have multiple arguments as well as default values.
To define functions we use `function`. For example the following function adds 1 to the number it receives as an argument:
```{r}
my_func <- function(x){
y <- x + 1
y
}
```
The last value in the function, in this case that stored in `y`, gets returned.
If you run the code above R does not show anything. This means you defined the function. You can test it out like this:
```{r}
my_func(5)
```
`@instructions`
We will define a function `sum_n` for this exercise.
- Create a function `sum_n` that for any given value, say `n`, creates the vector `1:n`, and then computes the sum of the integers from 1 to n.
- Use the function you just defined to determine the sum of integers from 1 to 5,000.
`@hint`
- To make it inclusive, use {}.
- You can define the function like this:
```{r}
sum_n <- function(n){
x <- 1:n
sum(x)
}
```
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Create function called `sum_n`
# Use the function to determine the sum of integers from 1 to 5000
```
`@solution`
```{r}
# Create function called `sum_n`
sum_n <- function(n){
x <- 1:n
sum(x)
}
# Determine the sum of integers from 1 to 5000
sum_n(5000)
```
`@sct`
```{r}
test_error()
test_function("sum_n", incorrect_msg = "You need to check for the sum of 5000 integers!")
fun_def <- ex() %>% check_fun_def("sum_n")
fun_def %>% check_arguments()
fun_def %>% check_call(5000) %>% check_result() %>% check_equal()
fun_def %>% check_call(100) %>% check_result() %>% check_equal()
fun_def %>% check_call(1000) %>% check_result() %>% check_equal()
fun_def %>% check_body()
success_msg("This is awesome! Let's get to the next exercise.")
```
---
## Defining functions continued...
```yaml
type: NormalExercise
key: ea64484119
lang: r
xp: 100
skills:
- 1
```
We will make another function for this exercise. We will define a function `altman_plot` that takes two arguments `x` and `y` and plots the difference `y-x` in the y-axis against the sum `x+y` in the x-axis.
You can define functions with as many variables as you want. For example, here we need at least two, `x` and `y`. The following function plots log transformed values:
```{r}
log_plot <- function(x, y){
plot(log10(x), log10(y))
}
```
This function does not return anything. It just makes a plot.
`@instructions`
We will make another function for this exercise.
- Create a function `altman_plot` that takes two arguments `x` and `y` and plots `y-x` (on the y-axis) against `x+y` (on the x-axis).
- Note: don't use parentheses around the arguments in the `plot` function because you will confuse R.
`@hint`
The command inside the function should be: `plot(x + y, y - x)`. Remember to define a function you use
```{r}
altman_plot <- function(var_1, var_2){
# code that uses var_1 and var_2
}
```
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Create `altman_plot`
```
`@solution`
```{r}
# Create `altman_plot`
altman_plot <- function(x, y){
plot(x + y, y - x)
}
```
`@sct`
```{r}
test_error()
test_function_definition("altman_plot",
body_test = {
test_function("plot", incorrect_msg = "Make sure you call `plot` inside your function")
},
undefined_msg = "Be sure to define `altman_plot`.",
incorrect_number_arguments_msg = "Make sure the function takes two arguments.")
test_object("altman_plot", incorrect_msg = "Make sure you use `x` and `y` as your arguments and that you plot the difference `y-x` versus the sum `x+y`. Please write the sum as `x+y` not `y+x`. Also, for this exercise, we want you to use the curly brackets `{ }` as in the example. Do not use () around the arguments in the `plot` function because you will confuse R.")
success_msg("That's great! You can also play around with the variables and put in values for x and y.")
```
---
## Lexical scope
```yaml
type: NormalExercise
key: f13e10ed3c
lang: r
xp: 100
skills:
- 1
```
Lexical scoping is a convention used by many languages that determine when an object is available by its name. When you run the code below you will see which `x` is available at different points in the code.
```{r}
x <- 8
my_func <- function(y){
x <- 9
print(x)
y + x
}
my_func(x)
print(x)
```
Note that when we define `x` as 9, this is inside the function, but it is 8 after you run the function. The `x` changed inside the function but not outside.
`@instructions`
After running the code below, what is the value of `x`?
```{r}
x <- 3
my_func <- function(y){
x <- 5
y
print(x)
}
my_func(x)
```
`@hint`
Run the code and then `print(x)`. You will have the answer.
`@pre_exercise_code`
```{r}
x <- 3
```
`@sample_code`
```{r}
# Run this code
x <- 3
my_func <- function(y){
x <- 5
y+5
}
# Print the value of x
```
`@solution`
```{r}
# Show value of x
x
```
`@sct`
```{r}
test_error()
test_object("x", incorrect_msg = "How do you print the value of x?")
success_msg("Good job!")
```
---
## For loops
```yaml
type: NormalExercise
key: efe45f5c56
lang: r
xp: 100
skills:
- 1
```
In the next two exercises, we are going to write a for-loop. In that for-loop we are going to call a function. We start by defining that function here, in this exercise. We will call the for loop in the next exercise.
`@instructions`
- Write a function `compute_s_n` that for any given $n$ computes the sum $S_n = 1^2 + 2^2 + 3^2 + \dots + n^2$.
- Report the value of the sum when `n=10`.
`@hint`
The function will look something like this:
```{r}
compute_s_n <- function(n){
x <- 1:n
## sum the vector after you square it
}
```
Then you evaluate it at 10: `compute_s_n(10)`
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Here is an example of a function that adds numbers from 1 to n
example_func <- function(n){
x <- 1:n
sum(x)
}
# Here is the sum of the first 100 numbers
example_func(100)
# Write a function compute_s_n with argument n that for any given n computes the sum of 1 + 2^2 + ...+ n^2
# Report the value of the sum when n=10
```
`@solution`
```{r}
# Here is a function that adds numbers from 1 to n
example_func <- function(n){
x <- 1:n
sum(x)
}
# Here is the sum of the first 100 numbers
example_func(100)
# Write a function compute_s_n with argument n that for any given n computes the sum of 1 + 2^2 + ...+ n^2
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Report the value of the sum when n=10
compute_s_n(10)
```
`@sct`
```{r}
test_error()
test_function_result("compute_s_n", incorrect_msg ="Make sure your function is producing the correct output.")
test_output_contains("compute_s_n(10)", incorrect_msg = "Make sure your function is producing the correct output.")
test_student_typed("compute_s_n(10)", not_typed_msg = "Make sure you evaluate the function at 10 in the last line.")
fun_def <- ex() %>% check_fun_def("compute_s_n")
fun_def %>% check_arguments()
fun_def %>% check_call(10) %>% check_result() %>% check_equal()
fun_def %>% check_call(100) %>% check_result() %>% check_equal()
fun_def %>% check_call(1000) %>% check_result() %>% check_equal()
fun_def %>% check_body()
success_msg("That's awesome! You're getting so good at this!")
```
---
## For loops continued...
```yaml
type: NormalExercise
key: a9bf58a40c
lang: r
xp: 100
skills:
- 1
```
Now we are going to compute the sum of the squares for several values of $n$. We will use a for-loop for this. Here is an example of a for-loop:
```{r}
results <- vector("numeric", 10)
n <- 10
for(i in 1:n){
x <- 1:i
results[i] <- sum(x)
}
```
Note that we start with a call to `vector` which constructs an empty vector that we will fill while the loop runs.
`@instructions`
- Define an empty numeric vector `s_n` of size 25 using `s_n <- vector("numeric", 25)`.
- Compute the the sum when n is equal to each integer from 1 to 25 using the function we defined in the previous exercise: `compute_s_n`
- Save the results in `s_n`
`@hint`
The line of code inside the function will be `s_n[i] <- compute_s_n(i)`.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Define a function and store it in `compute_s_n`
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Create a vector for storing results
s_n <- vector("numeric", 25)
# write a for-loop to store the results in s_n
```
`@solution`
```{r}
# Define a function and store it in `compute_s_n`
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Create a vector for storing results
s_n <- vector("numeric", 25)
# Assign values to `n` and `s_n`
for(i in 1:25){
s_n[i] <- compute_s_n(i)
}
```
`@sct`
```{r}
test_error()
test_object("compute_s_n", undefined_msg = "Make sure to define compute_s_n first.", incorrect_msg = "Are you assigning the correct values to x? ")
test_object("s_n", incorrect_msg = "You are getting the incorrect values. Check that you are calling the function correctly.")
test_function("vector")
fun_def <- ex() %>% check_fun_def("compute_s_n")
fun_def %>% check_arguments()
fun_def %>% check_call(10) %>% check_result() %>% check_equal()
fun_def %>% check_call(100) %>% check_result() %>% check_equal()
fun_def %>% check_call(1000) %>% check_result() %>% check_equal()
fun_def %>% check_body()
for_loop <- ex() %>% check_for()
# check condition ("i in 1:5" in the solution)
# note that check_code might be too stringent, since someone could
# create a variable, e.g. a <- 1:5, and give that to the for loop
for_loop %>% check_cond()
#%>% check_code("in n", fixed = TRUE)
success_msg("This is great! You now know the basics of for loops in R.")
```
---
## Checking our math
```yaml
type: NormalExercise
key: 4e649b5400
lang: r
xp: 100
skills:
- 1
```
If we do the math, we can show that
$$S_n = 1^2 + 2^2 + 3^2 + \dots + n^2 = n(n+1)(2n+1)/6 $$
We have already computed the values of $S_n$ from 1 to 25 using a for loop.
If the formula is correct then a plot of $S_n$ versus $n$ should look cubic.
Let's make this plot.
`@instructions`
- Define `n <- 1:25`. Note that with this we can use `for(i in n)`
- Use a for loop to save the sums into a vector `s_n <- vector("numeric", 25)`
- Plot `s_n` (on the y-axis) against `n` (on the x-axis).
`@hint`
In the plot code, `n` is the x-axis, and `s_n` is the y-axis.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Define the function
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Define the vector of n
n <- 1:25
# Define the vector to store data
s_n <- vector("numeric", 25)
for(i in n){
s_n[i] <- compute_s_n(i)
}
# Create the plot
```
`@solution`
```{r}
# Define the function
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Define the vector of n
n <- 1:25
# Define the vector to store data
s_n <- vector("numeric", 25)
for(i in n){
s_n[i] <- compute_s_n(i)
}
# Create the plot
plot(n, s_n)
```
`@sct`
```{r}
test_error()
test_function("plot",args=c("x","y"), incorrect_msg = "Did you plot the x and y axis correctly?")
success_msg("Awesome! Let's go to the last exercise in this chapter!")
```
---
## Checking our math continued
```yaml
type: NormalExercise
key: 33ed29e3c1
lang: r
xp: 100
skills:
- 1
```
Now let's actually check if we get the exact same answer.
`@instructions`
- Confirm that `s_n` and $n(n+1)(2n+1)/6$ are the same using the `identical` command.
`@hint`
`s_n` is a vector. Because `n` is a vector `n*(n+1)*(2*n+1)/6` is a vector. You can compare two vectors `x` and `y` with `identical(x, y)`.
`@pre_exercise_code`
```{r}
# no pec
```
`@sample_code`
```{r}
# Define the function
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Define the vector of n
n <- 1:25
# Define the vector to store data
s_n <- vector("numeric", 25)
for(i in n){
s_n[i] <- compute_s_n(i)
}
# Check that s_n is identical to the formula given in the instructions.
```
`@solution`
```{r}
# Define the function
compute_s_n <- function(n){
x <- 1:n
sum(x^2)
}
# Define the vector of n
n <- 1:25
# Define the vector to store data
s_n <- vector("numeric", 25)
for(i in n){
s_n[i] <- compute_s_n(i)
}
# Check that s_n is identical to the formula given in the instructions.
identical(s_n, n*(n+1)*(2*n+1)/6)
```
`@sct`
```{r}
test_error()
test_function("identical")
test_output_contains("identical(s_n, n*(n+1)*(2*n+1)/6)", incorrect_msg = "Make sure you're checking your formula!")
success_msg("This is great! We are done with this module. Time to move on to bigger things!")
```
---
## End of Assessment 9
```yaml
type: PureMultipleChoiceExercise
key: 226fb8de61
lang: r
xp: 50
skills:
- 1
```
This is the end of the programming assignment for this section. Please DO NOT click through to additional assessments from this page. Please DO answer the question on this page. If you do click through, your scores may NOT be recorded.
Click on "Awesome" to get the "points" for this question and then return to the course on edX.
You can now close this window to go back to the <a href='https://www.edx.org/course/data-science-r-basics-2'>course</a>.
`@hint`
- No hint necessary!
`@possible_answers`
- [Awesome]
- Nope
`@feedback`
- Great! Now navigate back to the course on edX!
- Now navigate back to the course on edX!