# Enumerate possible options {#sec-enumerate-options}
```{r}
#| include = FALSE
source("common.R")
```
```{r}
#| eval = FALSE,
#| include = FALSE
source("fun_def.R")
pkg_funs("base") %>% funs_formals_keep(~ is_call(.x, "c"))
has_several_ok <- function(x) {
  if (is_call(x, "match.arg")) {
    x <- call_standardise(x)
    isTRUE(x$several.ok)
  } else if (is_call(x)) {
    some(x[-1], has_several_ok)
  } else {
    FALSE
  }
}
pkg_funs("utils") %>% funs_body_keep(has_several_ok)
```
## What's the pattern?
If the possible values of an argument are a small set of strings, set the default argument to the set of possible values, and then use `match.arg()` or `rlang::arg_match()` in the function body.
This convention advertises to the knowledgeable user[^enumerate-options-1] what the possible values are, and makes it easy to generate an informative error message for inappropriate inputs.
This interface is often coupled with an implementation that uses `switch()`.
[^enumerate-options-1]: The main downside of this technique is that many users aren't aware of this convention, or that the first value of the vector is used as the default.
This convention makes it possible to advertise the possible set of values for an argument.
The advertisement happens in the function specification, so you see it in tooltips and autocomplete, without having to look at the documentation.
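
As a minimal sketch of the whole pattern, consider a small padding helper. The function `pad()` and its `side` argument are hypothetical, invented purely for illustration; only `match.arg()` and `switch()` come from base R.

```{r}
# Hypothetical example: `pad()` and `side` are invented for illustration.
pad <- function(x, width, side = c("left", "right", "both")) {
  # With no `side` supplied, match.arg() picks the first value, "left".
  # An unknown value produces an error that lists the valid choices.
  side <- match.arg(side)

  # Number of spaces needed to reach `width` (never negative)
  n <- pmax(width - nchar(x), 0)
  spaces <- function(k) strrep(" ", k)

  # Each enumerated value maps to one branch
  switch(side,
    left = paste0(spaces(n), x),
    right = paste0(x, spaces(n)),
    both = paste0(spaces(n %/% 2), x, spaces(n - n %/% 2))
  )
}

pad("a", 3)
pad("a", 3, side = "right")
pad("a", 3, side = "both")
```

The signature alone tells the reader which values `side` accepts, and the body never has to validate the input by hand.
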
## What are some examples?
- In `difftime()`, `units` can be any one of "auto", "secs", "mins", "hours", "days", or "weeks".
- In `format()`, `justify` can be "left", "right", "centre", or "none".
- In `trimws()`, you can choose `which` side to remove whitespace from: "both", "left", or "right".
- In `rank()`, the `ties.method` argument exposes six different methods for handling ties: "average", "first", "last", "random", "max", or "min".
- `quantile()` exposes nine different approaches to computing a quantile through the `type` argument.
- `p.adjust()` exposes eight strategies for adjusting P values to account for multiple comparisons using the `method` argument, whose default is the exported vector `p.adjust.methods`.
## How do I use this pattern?
To use this technique, set the default value to a character vector, where the first value is the default.
Inside the function, use `match.arg()` or `rlang::arg_match()` to check that the value comes from the known good set, and pick the default if none is supplied.
Take `rank()`, for example.
The heart of its implementation looks like this:
```{r}
rank <- function(
  x,
  ties.method = c("average", "first", "last", "random", "max", "min")
) {
  ties.method <- match.arg(ties.method)
  switch(ties.method,
    average = ,
    min = ,
    max = .Internal(rank(x, length(x), ties.method)),
    first = sort.list(sort.list(x)),
    last = sort.list(rev.default(sort.list(x, decreasing = TRUE))),
    random = sort.list(order(x, stats::runif(length(x))))
  )
}
x <- c(1, 2, 2, 3, 3, 3)
rank(x)
rank(x, ties.method = "first")
rank(x, ties.method = "min")
```
Note that `match.arg()` will automatically throw an error if the value is not in the set:
```{r}
#| error = TRUE
rank(x, ties.method = "middle")
```
It also supports partial matching so that the following code is shorthand for `ties.method = "random"`:
```{r}
rank(x, ties.method = "r")
```
We prefer to avoid partial matching because while it saves a little time writing the code, it makes reading the code less clear.
`rlang::arg_match()` is an alternative to `match.arg()` that doesn't support partial matching.
It instead provides a helpful error message:
```{r}
#| error = TRUE
rank2 <- function(
  x,
  ties.method = c("average", "first", "last", "random", "max", "min")
) {
  ties.method <- rlang::arg_match(ties.method)
  rank(x, ties.method = ties.method)
}
rank2(x, ties.method = "r")
# It also provides a suggestion if you misspell the argument
rank2(x, ties.method = "avarage")
```
### Escape hatch
It's sometimes useful to build in an escape hatch from canned strategies.
This allows users to access alternative strategies, and allows for experimentation that can later turn into official strategies.
One example of such an escape hatch is in name repair, which occurs in many places throughout the tidyverse.
One place you might encounter it is in `tibble()`:
```{r}
#| error: true
tibble::tibble(a = 1, a = 2)
```
Beneath the surface, all tidyverse functions that expose some sort of name repair eventually end up calling `vctrs::vec_as_names()`:
```{r}
#| error: true
vctrs::vec_as_names(c("a", "a"), repair = "check_unique")
vctrs::vec_as_names(c("a", "a"), repair = "unique")
vctrs::vec_as_names(c("a", "a"), repair = "unique_quiet")
```
`vec_as_names()` exposes six strategies, but it also allows you to supply a function:
```{r}
vctrs::vec_as_names(c("a", "a"), repair = toupper)
```
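
One way to build a similar escape hatch into your own function is to check for a function before matching against the enumerated strings. The sketch below is an assumption about how this could be done in base R, not a description of how vctrs implements it; `clean_names()` and its three strategies are hypothetical.

```{r}
# Hypothetical helper: `clean_names()` and its strategies are invented here.
clean_names <- function(x, repair = c("none", "lower", "upper")) {
  # Escape hatch: a user-supplied function bypasses the canned strategies
  if (is.function(repair)) {
    return(repair(x))
  }

  # Otherwise fall back to the usual enumerated-string handling
  repair <- match.arg(repair)
  switch(repair,
    none = x,
    lower = tolower(x),
    upper = toupper(x)
  )
}

clean_names(c("A", "b"))
clean_names(c("A", "b"), repair = "lower")
clean_names(c("A", "b"), repair = function(x) paste0("col_", x))
```

The default still advertises the canned strategies, so the function form needs to be described in the documentation.
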
### How do I keep defaults short?
This technique is best used when the set of possible values is short, as otherwise you run the risk of dominating the function spec with this one argument (@sec-defaults-short-and-sweet).
If you have a long list of possibilities, there are three possible solutions:
- Set a single default and supply the possible values to `match.arg()`/`arg_match()`:

  ```{r}
  rank2 <- function(x, ties.method = "average") {
    ties.method <- arg_match(
      ties.method,
      c("average", "first", "last", "random", "max", "min")
    )
  }
  ```
- If the values are used by many functions, you can store the options in an exported vector:

  ```{r}
  ties.methods <- c("average", "first", "last", "random", "max", "min")
  rank2 <- function(x, ties.method = ties.methods) {
    ties.method <- arg_match(ties.method)
  }
  ```

  For example, `stats::pairwise.prop.test()`, `stats::pairwise.t.test()`, and `stats::pairwise.wilcox.test()` all use `p.adjust.method = p.adjust.methods`, and `stats::p.adjust()` itself uses `method = p.adjust.methods`.
- You can store the options in an exported named list[^enumerate-options-2].
  That has the advantage that you can advertise both the source of the values and the default, and the user gets a nice auto-complete of the possible values.

  ```{r}
  library(rlang)
  ties <- as.list(set_names(c("average", "first", "last", "random", "max", "min")))
  rank2 <- function(x, ties.method = ties$average) {
    ties.method <- arg_match(ties.method, names(ties))
  }
  ```
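
The first option in the list above can also be written with base R alone, by passing the choices as the second argument of `match.arg()`. This is only a sketch: `rank3()` is a hypothetical name, and it calls `base::rank()` explicitly so that it does not depend on any `rank()` defined elsewhere.

```{r}
# Hypothetical sketch: `rank3()` is invented for illustration.
rank3 <- function(x, ties.method = "average") {
  # Passing the choices explicitly keeps the function signature short
  ties.method <- match.arg(
    ties.method,
    c("average", "first", "last", "random", "max", "min")
  )
  base::rank(x, ties.method = ties.method)
}

rank3(c(10, 20, 20))
rank3(c(10, 20, 20), ties.method = "min")
```

Note that `match.arg()` still allows partial matching in this form, and the possible values no longer appear in the function signature, so they need to be listed in the documentation.
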
[^enumerate-options-2]: Thanks to Brandon Loudermilk