# Syntax
## Object names
> "There are only two hard things in Computer Science: cache invalidation and
> naming things."
>
> --- Phil Karlton

Variable and function names should use only lowercase letters, numbers, and `_`.
Use underscores (`_`), so-called snake case, to separate words within a name.
```{r, eval = FALSE}
# Good
day_one
day_1
# Bad
DayOne
dayone
```
Base R uses dots in function names (`contrib.url()`) and class names
(`data.frame`), but it's better to reserve dots exclusively for the S3 object
system. In S3, methods are given the name `function.class`; if you also use
`.` in function and class names, you end up with confusing methods like
`as.data.frame.data.frame()`.
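
A minimal sketch of how S3 builds method names, using a hypothetical generic `area()` and a hypothetical class `circle` (neither exists in base R). The method name is the generic and the class joined by a dot, so a dot inside either name blurs where one ends and the other begins.

```r
# Hypothetical S3 generic: dispatches on the class of `shape`.
area <- function(shape, ...) {
  UseMethod("area")
}

# Method for the hypothetical class "circle", named `generic.class`.
area.circle <- function(shape, ...) {
  pi * shape$radius^2
}

unit_circle <- structure(list(radius = 1), class = "circle")
circle_area <- area(unit_circle)
```
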
If you find yourself attempting to cram data into variable names (e.g. `model_2018`, `model_2019`, `model_2020`), consider using a list or data frame instead.
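
As a rough illustration, the year can live in the names of a list rather than in separate variable names. The `models` list and its contents are placeholders invented for this sketch.

```r
# Years that would otherwise end up in variable names.
years <- c(2018, 2019, 2020)

# One list indexed by year, instead of model_2018, model_2019, model_2020.
# Each element is a stand-in for a fitted model.
models <- lapply(years, function(year) {
  list(year = year, label = paste0("model for ", year))
})
names(models) <- as.character(years)

# Look up one year by name.
models[["2019"]]$label
```
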
Generally, variable names should be nouns and function names should be verbs.
Strive for names that are concise and meaningful (this is not easy!).
```{r, eval = FALSE}
# Good
day_one
# Bad
first_day_of_the_month
djm1
```
Where possible, avoid re-using names of common functions and variables. This
will cause confusion for the readers of your code.
```{r, eval = FALSE}
# Bad
T <- FALSE
c <- 10
mean <- function(x) sum(x)
```
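A small sketch of what goes wrong with the last line of the example above: once `mean` is redefined, later calls silently use the new definition. The variable names here are placeholders.

```r
# Masking a base function: later calls use the new definition.
mean <- function(x) sum(x)
masked_result <- mean(c(1, 2, 3))      # sum, not average

# The original is still reachable through its namespace.
base_result <- base::mean(c(1, 2, 3))

# Remove the masking definition to restore normal lookup.
rm(mean)
```
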
## Spacing
### Commas
Always put a space after a comma, never before, just like in regular English.
```{r, eval = FALSE}
# Good
x[, 1]
# Bad
x[,1]
x[ ,1]
x[ , 1]
```
### Parentheses
Do not put spaces inside or outside parentheses for regular function calls.
```{r, eval = FALSE}
# Good
mean(x, na.rm = TRUE)
# Bad
mean (x, na.rm = TRUE)
mean( x, na.rm = TRUE )
```
Place a space before and after `()` when used with `if`, `for`, or `while`.
```{r, eval = FALSE}
# Good
if (debug) {
  show(x)
}

# Bad
if(debug){
  show(x)
}
```
Place a space after `()` used for function arguments:
```{r, eval = FALSE}
# Good
function(x) {}
# Bad
function (x) {}
function(x){}
```
### Infix operators
Most infix operators (`==`, `+`, `-`, `<-`, etc.) should always be surrounded by
spaces:
```{r, eval = FALSE}
# Good
height <- (feet * 12) + inches
mean(x, na.rm = TRUE)

# Bad
height<-feet*12+inches
mean(x, na.rm=TRUE)
```
There are a few exceptions, which should never be surrounded by spaces:
* The operators with [high precedence][syntax]: `::`, `:::`, `$`, `@`, `[`,
`[[`, `^`, unary `-`, unary `+`, and `:`.
```{r, eval = FALSE}
# Good
sqrt(x^2 + y^2)
df$z
x <- 1:10
# Bad
sqrt(x ^ 2 + y ^ 2)
df $ z
x <- 1 : 10
```
* Single-sided formulas when the right-hand side is a single identifier:
```{r, eval = FALSE}
# Good
~foo
tribble(
  ~col1, ~col2,
  "a", "b"
)

# Bad
~ foo
tribble(
  ~ col1, ~ col2,
  "a", "b"
)
```
Note that single-sided formulas with a complex right-hand side do need a space:
```{r, eval = FALSE}
# Good
~ .x + .y
# Bad
~.x + .y
```
* The tidy evaluation operators `!!` (bang-bang) and `!!!` (bang-bang-bang),
  because they have precedence equivalent to unary `-`/`+`:
```{r, eval = FALSE}
# Good
call(!!xyz)
# Bad
call(!! xyz)
call( !! xyz)
call(! !xyz)
```
* The help operator, `?`:
```{r, eval = FALSE}
# Good
package?stats
?mean
# Bad
package ? stats
? mean
```
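One way to read the high-precedence exceptions listed above is that the tight spacing mirrors how R groups these operators. A short sketch, with placeholder variable names:

```r
# `^` binds tighter than unary minus, so this is -(2^2).
neg_square <- -2^2

# Unary minus binds tighter than `:`, so this is (-1):3.
from_minus_one <- -1:3

# `:` binds tighter than `*`, so this is 2 * (1:3).
doubled <- 2 * 1:3
```
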
### Extra spaces
Adding extra spaces is OK if it improves alignment of `=` or `<-`.
```{r, eval = FALSE}
# Good
list(
  total = a + b + c,
  mean  = (a + b + c) / n
)

# Also fine
list(
  total = a + b + c,
  mean = (a + b + c) / n
)
```
Do not add extra spaces to places where space is not usually allowed.
## Argument names
A function's arguments typically fall into two broad categories: one supplies
the __data__ to compute on; the other controls the __details__ of computation.
When you call a function, you typically omit the names of data arguments,
because they are used so commonly. If you override the default value of an
argument, use the full name:
```{r, eval = FALSE}
# Good
mean(1:10, na.rm = TRUE)
# Bad
mean(x = 1:10, , FALSE)
mean(, TRUE, x = c(1:10, NA))
```
Avoid partial matching.
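
A brief sketch of what partial matching looks like in practice, using base R's `seq()`, whose full argument name is `length.out`. The variable names are placeholders.

```r
# Bad: `length` is silently completed to the `length.out` argument of seq().
partial <- seq(1, 10, length = 4)

# Good: spell out the full argument name.
full <- seq(1, 10, length.out = 4)

# Both calls return the same values, which is why the shortcut goes unnoticed.
identical(partial, full)
```

Setting `options(warnPartialMatchArgs = TRUE)` makes R warn whenever an argument is matched partially, which can help find such calls.
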
## Code blocks {#indenting}
Curly braces, `{}`, define the most important hierarchy of R code. To make this
hierarchy easy to see:
* `{` should be the last character on the line. Related code (e.g., an `if`
clause, a function declaration, a trailing comma, ...) must be on the same
line as the opening brace.
* The contents should be indented by two spaces.
* `}` should be the first character on the line.
```{r, eval = FALSE}
# Good
if (y < 0 && debug) {
  message("y is negative")
}

if (y == 0) {
  if (x > 0) {
    log(x)
  } else {
    message("x is negative or zero")
  }
} else {
  y^x
}

test_that("call1 returns an ordered factor", {
  expect_s3_class(call1(x, y), c("factor", "ordered"))
})

tryCatch(
  {
    x <- scan()
    cat("Total: ", sum(x), "\n", sep = "")
  },
  interrupt = function(e) {
    message("Aborted by user")
  }
)

# Bad
if (y < 0 && debug) {
message("Y is negative")
}
if (y == 0)
{
if (x > 0) {
log(x)
} else {
message("x is negative or zero")
}
} else { y ^ x }
```
### Inline statements {#inline-statements}
It's ok to drop the curly braces for very simple statements that fit on one line, as long as they don't have side-effects.
```{r}
# Good
y <- 10
x <- if (y < 20) "Too low" else "Too high"
```
Function calls and statements that affect control flow (like `return()`, `stop()`, `break`, or `next`) should always go in their own `{}` block:
```{r}
# Good
if (y < 0) {
  stop("Y is negative")
}

find_abs <- function(x) {
  if (x > 0) {
    return(x)
  }
  x * -1
}

# Bad
if (y < 0) stop("Y is negative")

if (y < 0)
  stop("Y is negative")

find_abs <- function(x) {
  if (x > 0) return(x)
  x * -1
}
```
## Long lines
Strive to limit your code to 80 characters per line. This fits comfortably on a
printed page with a reasonably sized font. If you find yourself running out of
room, this is a good indication that you should encapsulate some of the work in
a separate function.
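
A rough sketch of how to check a script against the limit with base R alone. `find_long_lines()` is a hypothetical helper written for this example, not part of any package.

```r
# Hypothetical helper: line numbers in `path` longer than `limit` characters.
find_long_lines <- function(path, limit = 80) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines) > limit)
}

# Usage on a temporary file with one short line and one 81-character line.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1", strrep("#", 81)), script)
long_lines <- find_long_lines(script)
unlink(script)
```
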
If a function call is too long to fit on a single line, use one line each for
the function name, each argument, and the closing `)`.
This makes the code easier to read and to change later.
```{r, eval = FALSE}
# Good
do_something_very_complicated(
  something = "that",
  requires = many,
  arguments = "some of which may be long"
)

# Bad
do_something_very_complicated("that", requires, many, arguments,
"some of which may be long"
)
```
As described under [Argument names], you can omit the argument names
for very common arguments (i.e. for arguments that are used in almost every
invocation of the function). Short unnamed arguments can also go on the same
line as the function name, even if the whole function call spans multiple lines.
```{r, eval = FALSE}
map(x, f,
  extra_argument_a = 10,
  extra_argument_b = c(1, 43, 390, 210209)
)
```
You may also place several arguments on the same line if they are closely
related to each other, e.g., strings in calls to `paste()` or `stop()`. When
building strings, where possible match one line of code to one line of output.
```{r, eval = FALSE}
# Good
paste0(
  "Requirement: ", requires, "\n",
  "Result: ", result, "\n"
)

# Bad
paste0(
  "Requirement: ", requires,
  "\n", "Result: ",
  result, "\n")
```
## Assignment
Use `<-`, not `=`, for assignment.
```{r}
# Good
x <- 5
# Bad
x = 5
```
## Semicolons
Don't put `;` at the end of a line, and don't use `;` to put multiple commands
on one line.
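
A short illustration of both cases; the variable names are placeholders.

```r
# Good
x <- 1
y <- 2

# Bad
x <- 1; y <- 2
x <- 1;
```
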
## Quotes
Use `"`, not `'`, for quoting text. The only exception is when the text already
contains double quotes and no single quotes.
```{r, eval = FALSE}
# Good
"Text"
'Text with "quotes"'
'<a href="http://style.tidyverse.org">A link</a>'
# Bad
'Text'
'Text with "double" and \'single\' quotes'
```
## Comments
In data analysis code, use comments to record important findings and analysis
decisions. If you need comments to explain what your code is doing, consider
rewriting your code to be clearer. If you discover that you have more comments
than code, consider switching to R Markdown.
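
A small sketch of the difference between a comment that restates the code and one that records a decision. The data and the 10-second threshold are made up for illustration.

```r
# Hypothetical response times in seconds, invented for this example.
response_time <- c(0.8, 1.1, 0.9, 42.0, 1.0)

# Bad: restates what the code does.
# Keep values below 10.

# Good: records the analysis decision and the reason for it.
# Values of 10 s or more are treated as timeouts rather than real responses,
# so they are excluded before summarising.
valid_time <- response_time[response_time < 10]
median(valid_time)
```
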
[syntax]:http://stat.ethz.ch/R-manual/R-patched/library/base/html/Syntax.html