-
Notifications
You must be signed in to change notification settings - Fork 0
/
extra-chunks.Rmd
140 lines (108 loc) · 3.4 KB
/
extra-chunks.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
---
class: left
### El
###
# Holds chunks to be inserted later
---
class: left
### A bad function
- One example in base R is the **NROW** or **NCOL** functions. They return a value even if the
input is a vector or single value instead of a data frame:
```{r nrow-example, eval = TRUE, echo = TRUE}
df <- data.frame(a = 1:3, b = letters[1:3])
NROW(df)
vec <- letters[1:10]
NROW(vec)
single <- "A"
NROW(single)
```
- **NEVER** use these functions or be tempted to create such functions.
---
class: center
### R is hard to learn (2)
.left[
Matrices and data.frames are not the same, they are stored internally in two completely different ways
- data.frames are stored as a list of columns (which are vectors)
- matrices are stored as a sequential 1-dimensional vector with dimension information so R knows how to break the vector up into 2-dimensions
- because of this, matrices are much faster for math operations
- data.frames use `names()` to get or set the column names
- matrices use `colnames()` to get or set the column names
A matrix is a linear vector
```{r matrix-impl, echo = TRUE, eval = TRUE}
mat <- matrix(1:9, nrow = 3)
print(mat)
lobstr::sxp(mat)
```
]
---
class: center
### R is hard to learn (3)
.left[
A data.frame is a list of column data
```{r dataframe-impl,echo = TRUE, eval = TRUE}
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c(7, 8, 9))
print(df)
lobstr::sxp(df)
```
Some functions designed to work with data.frames will fail with matrices, and vice-versa. Worse, some may succeed but give you the wrong output.
]
---
class: center
### There are no operators
.left[
There are no operators in R, only functions. For example
```{r operators, echo = TRUE, eval = TRUE}
5 + 8 * 10
```
is actually called like this by R
```{r non-operators, echo = TRUE, eval = TRUE}
`+`(5, `*`(8, 10))
```
Here, `` `r '\x60+\x60()'` `` and `` `r '\x60*\x60()'` `` are functions taking two arguments each. All operators are actually functions like this.
To prove these two lines of code are equivalent, look at the Abstract Syntax Trees (ASTs) that R creates when it parses an expression like this:
]
.left[
.pull-left[
```{r ast-operators, echo = TRUE, eval = TRUE}
lobstr::ast({5 + 8 * 10})
```
]
.pull-right[
```{r ast-non-operators, echo = TRUE, eval = TRUE}
lobstr::ast({`+`(5, `*`(8, 10))})
```
]
]
---
class: center
### `if`, `else`, and all left braces (`[`, `(`, and `{`) are functions
.left[
Almost everything in R is a function. This can help you when trying to understand code
```{r simple-if-statement, echo = TRUE, eval = TRUE}
t_or_f <- sample(c(TRUE, FALSE), size = 1)
if(t_or_f){
x <- -9
y <- -99
z <- -999
}else{
x <- 10
y <- 100
z <- 1000
}
paste0("t_or_f = ", t_or_f, ", x = ", x, ", y = ", y, ", z = ", z)
```
Can be written like this:
```{r exotic-if-statement, echo = TRUE, eval = TRUE}
if(t_or_f) `{`(x <- -9, y <- -99, z <- -999) else(`{`(x <- 10, y <- 100, z <- 1000))
paste0("t_or_f = ", t_or_f, ", x = ", x, ", y = ", y, ", z = ", z)
```
This is because `if`, `else`, and `{` are all functions so an `if..else` statement is implemented as this:
`if()` `` `r '\x60{\x60()'` `` `` `r 'else(\x60{\x60())'` `` and `` `r '\x60{\x60()'` `` has each line of code inside as an argument.
]
---
class: left
### Variable naming syntax
- Inconsistent function names:
- `names()` vs `colnames()`
- `row.names()` vs `rownames()`
- `Sys.time()` vs `system.time()`