2019-12-03_avoid-hidden-arguments.Rmd
---
output: github_document
---
# Avoid hidden arguments
-- [Tidyverse design guide](https://principles.tidyverse.org/args-hidden.html)
<https://twitter.com/mauro_lepore>
License: [CC0](https://creativecommons.org/choose/zero/?lang=es)
## Hidden arguments make code harder to reason about, because to correctly predict the output you also need to know some other state
```{r}
y <- 1
add <- function(x) {
  x + y
}
add(1)

y <- 10 ## It is hard to keep track of this
add(1)
```
## Functions are easier to understand if the results depend only on the values of the inputs
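As a minimal sketch of this idea, `add()` from above can be rewritten so that `y` is an explicit argument. The name `add_explicit()` is hypothetical and used only for illustration.

```{r}
# Hypothetical rewrite of `add()`: `y` is now an explicit argument
add_explicit <- function(x, y) {
  x + y
}

y <- 10 # A global `y` no longer affects the result
add_explicit(1, y = 1)
```

The result now depends only on the values passed in the call.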
## How can I remediate the problem?
If you have an existing function with a hidden input:
1. Make the input an explicit argument, with a default if needed.
2. Print a message when the default is used, so the caller can see which value was picked up.
## For example, take `prepare_data()`
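The output depends on `data`, but `data` is hidden: it is read inside the function from the global variable `path`, so neither appears in the function signature.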
```{r}
prepare_data <- function() {
  data <- read.csv(path)
  data[1:2, 1:2]
}

path <- tempfile()
readr::write_csv(mtcars, path)
prepare_data()
```
## 1. `prepare_data()` gains the explicit argument `data`
```{r}
prepare_data <- function(data = read.csv(path)) {
  data[1:2, 1:2]
}

prepare_data()
```
## 2. `prepare_data()` now prints `data`
```{r}
prepare_data <- function(data = read.csv(path)) {
  if (missing(data)) {
    message(
      "Using `data` with names: ", paste(names(data), collapse = ", ")
    )
  }
  data[1:2, 1:2]
}

prepare_data()
prepare_data(read.csv(path))
```
## But `data` should be supplied
> Data arguments provide the core data. They are required, and are usually vectors and often determine the type and size of the output. Data arguments are often called `data`, `x`, or `y` -- [tidyverse design guide](https://principles.tidyverse.org/args-data-details.html).
```{r}
prepare_data <- function(data) {
  data[1:2, 1:2]
}

try(prepare_data())

data <- read.csv(path)
prepare_data(data)
```
# Some functions do need to depend on external state ...
## A function has hidden arguments when it returns different results with the same inputs in a surprising way
## Surprising
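```{r}
# This example assumes R < 4.0.0, where strings became factors by default
getOption("stringsAsFactors")
data.frame(x = "a")$x

old_options <- options(stringsAsFactors = FALSE)
getOption("stringsAsFactors")
data.frame(x = "a")$x

# Restore the previous options; `on.exit(old_options)` would not reset them
options(old_options)
```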
Global options should not affect computation.
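One way for a caller to sidestep the global option, sketched here with base R only, is to pass `stringsAsFactors` explicitly. The class of the column should then not depend on `getOption("stringsAsFactors")`.

```{r}
# Passing the argument explicitly makes the result independent of the global option
chr <- data.frame(x = "a", stringsAsFactors = FALSE)$x
fct <- data.frame(x = "a", stringsAsFactors = TRUE)$x

class(chr)
class(fct)
```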
## Not surprising
`read_csv(path)` depends not only on `path` but also on the contents of the file. That is not surprising, because reading the file is the purpose of the function.
```{r, message=FALSE}
library(readr)
path <- tempfile()
write_csv(mtcars, path)
names(read_csv(path))
write_csv(iris, path)
names(read_csv(path))
```