forked from edzer/sdsr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path08-Plotting.qmd
538 lines (461 loc) · 21.6 KB
/
08-Plotting.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
# Plotting spatial data {#sec-plotting}
\index{maps!plotting}
Together with timelines, maps belong to the most powerful graphs,
perhaps because we can immediately relate to where we are, or
once have been, on the space of the plot. Two recent books on
visualisation [@Healy; @Wilke] contain chapters on visualising
geospatial data or maps. Here, we will not try to point out which
maps are good and which are bad, but rather a number of
possibilities for creating them, challenges along
the way, and possible ways to mitigate them.
## Every plot is a projection {#sec-transform}
\index{maps!projections}
The world is round, but plotting devices are flat. As mentioned
in @sec-projections, any time we visualise, in any
way, the world on a flat device, we project: we convert ellipsoidal
coordinates into Cartesian coordinates. This includes the cases where
we think we "do nothing" as in @fig-world (left), or where we show
the world "as it is", as one would see it from space (@fig-world, right).
```{r fig-world, echo = !knitr::is_latex_output(), message = FALSE}
#| fig.cap: "Earth country boundaries; left: mapping long/lat linearly to $x$ and $y$ (plate carrée); right: as seen from an infinite distance (orthographic)"
#| out.width: 90%
library(sf)
library(rnaturalearth)
w <- ne_countries(scale = "medium", returnclass = "sf")
suppressWarnings(st_crs(w) <- st_crs('OGC:CRS84'))
layout(matrix(1:2, 1, 2), c(2,1))
par(mar = rep(0, 4))
plot(st_geometry(w))
# sphere:
library(s2)
g <- as_s2_geography(TRUE) # Earth
co <- s2_data_countries()
oc <- s2_difference(g, s2_union_agg(co)) # oceans
b <- s2_buffer_cells(as_s2_geography("POINT(-30 -10)"), 9800000) # visible half
i <- s2_intersection(b, oc) # visible ocean
co <- s2_intersection(b, co)
plot(st_transform(st_as_sfc(i), "+proj=ortho +lat_0=-10 +lon_0=-30"), col = 'lightblue')
plot(st_transform(st_as_sfc(co), "+proj=ortho +lat_0=-10 +lon_0=-30"), col = NA, add = TRUE)
```
\newpage
The left plot of @fig-world was obtained by
```{r eval=FALSE}
library(sf)
library(rnaturalearth)
w <- ne_countries(scale = "medium", returnclass = "sf")
plot(st_geometry(w))
```
indicating that this is the default projection for global data with ellipsoidal coordinates:
```{r}
st_is_longlat(w)
```
The projection taken in @fig-world (left) is the equirectangular
(or equidistant cylindrical) projection, which maps longitude and
latitude linearly to the $x$- and $y$-axis, keeping an aspect ratio of 1.
If we would do this for smaller areas not on the equator, then it
would make sense to choose a plot ratio such that one distance unit E-W
equals one distance unit N-S at the centre of the plotted area,
and this is the default behaviour of the `plot` method for
unprojected `sf` or `stars` datasets, as well as the default for
`ggplot2::geom_sf` (@sec-geomsf).
We can also carry out this projection before plotting. Say we want to
plot Germany, then after loading the (rough) country outline,
we use `st_transform` to project:
```{r}
DE <- st_geometry(ne_countries(country = "germany",
returnclass = "sf"))
DE |> st_transform("+proj=eqc +lat_ts=51.14 +lon_0=90w") ->
DE.eqc
```
Here, `eqc` refers to the "equidistant cylindrical" projection of PROJ.
The projection parameter here is `lat_ts`, the latitude of _true
scale_, where one length unit N-S equals one length unit E-W.
This was chosen at the middle of the bounding box latitudes
```{r, echo=!knitr::is_latex_output()}
#| code-fold: true
print(mean(st_bbox(DE)[c("ymin", "ymax")]), digits = 4)
```
We plot both maps in @fig-eqc, and they look identical up to the values
along the axes: degrees for ellipsoidal (left) and metres for
projected (Cartesian, right) coordinates.
```{r fig-eqc, echo=!knitr::is_latex_output()}
#| fig.height: 4.5
#| out.width: 70%
#| code-fold: true
#| fig.cap: "Germany in equirectangular projection: with axis units degrees (left) and metres in the equidistant cylindrical projection (right)"
par(mfrow = c(1, 2), mar = c(2.2, 2.2, 0.3, 0.5))
plot(DE, axes = TRUE)
plot(DE.eqc, axes = TRUE)
```
### What is a good projection for my data?
\index{projection!properties}
There is unfortunately no silver bullet here. Projections that
maintain all distances do not exist; only globes do. The most
used projections try to preserve:
* areas (equal area)
* directions (conformal, such as _Mercator_)
* some properties of distances (_equirectangular_ preserves distances along meridians, _azimuthal equidistant_ preserves distances to a central point)
or some compromise of these. Parameters of projections decide what
is shown in the centre of a map and what is shown on the fringes, which
areas are up and which are down, and which areas are most enlarged.
All these choices are in the end political decisions.
It is often entertaining and at times educational to play around with
the different projections and understand their consequences. When
the primary purpose of the map however is not to entertain or educate
projection varieties, it may be preferable to choose a well-known or
less surprising projection and move the discussion which projection
to use to a decision process of its own. For global
maps however, in almost all cases, equal area projections are
preferred over plate carrée or web Mercator projections.
## Plotting points, lines, polygons, grid cells
\index{maps!plotting detail}
Since maps are just a special form of plots of statistical data,
the usual rules hold. Frequently occurring challenges include:
* polygons may be very small, and vanish when plotted
* depending on the data, polygons for different features may well
overlap, and be visible only partially; using transparent fill
colours may help identify them
* when points are plotted with symbols, they may easily overlap and be hidden; density maps (@sec-pointpatterns) may be more helpful
* lines may be hard to read when coloured and may overlap regardless the line width
### Colours
\index{maps!colours}
When plotting polygons filled with colours, one has the choice to plot
polygon boundaries or to suppress these. If polygon boundaries draw
too much attention, an alternative is to colour them in a grey tone,
or another colour that does not interfere with the fill colours. When
suppressing boundaries entirely, polygons with (nearly) identical
colours will no longer be visually distinguishable. If the property
indicating the fill colour is constant over the region, such as land
cover type, then this is not a problem, but if the property is an
aggregation then the region over which it was aggregated gets lost,
and by that the proper interpretation. Especially for extensive
variables, such as the amount of people living in a polygon, this
strongly misleads. But even with polygon boundaries, using filled
polygons for extensive variables may not be a good idea because
the map colours conflate amount and area size.
The use of continuous colour scales that have no noticeable colour
breaks for continuously varying variables may look attractive,
but is often more fancy than useful:
* it is impracticable to match a colour on the map with a legend value
* colour ramps often stretch non-linearly over the value range,
making it hard to convey magnitude
Only for cases where the identification of values is less
important than the continuity of the map, such as the colouring of
a high resolution digital terrain model, it does serve its goal.
Good colours scales and palettes are found in functions
`hcl.colors` or `palette.colors`, and in packages **RColorBrewer**
[@R-RColorBrewer], **viridis** [@R-viridis], or **colorspace**
[@R-colorspace; @colorspace].
### Colour breaks: `classInt` {#sec-classintervals}
\index{maps!colour breaks}
\index{colour breaks}
When plotting continuous geometry attributes using a limited set
of colours (or symbols), classes need to be made from the data.
R package **classInt** [@R-classInt] provides a number of methods to
do so. The default method is "quantile":
```{r}
library(classInt)
# set.seed(1) if needed ?
r <- rnorm(100)
(cI <- classIntervals(r))
cI$brks
```
it takes argument `n` for the number of intervals, and a `style`
that can be one of "fixed", "sd", "equal", "pretty", "quantile",
"kmeans", "hclust", "bclust", "fisher" or "jenks". Style "pretty"
may not obey `n`; if `n` is missing, `nclass.Sturges` is used;
two other methods are available for choosing `n` automatically. If
the number of observations is greater than 3000, a 10\% sample is used
to create the breaks for "fisher" and "jenks".
### Graticule and other navigation aids {#sec-graticule}
\index{maps!graticule}
\index{graticule}
A graticule is a network of lines on a map that follow constant
latitude or longitude. @fig-first-map shows a graticule drawn in
grey, on @fig-firstgather it is white. Graticules are often drawn in
maps to give place reference. In our first map in @fig-first-map we
can read that the area plotted is near 35$^o$ North and 80$^o$ West.
Had we plotted the lines in the projected coordinate system, they
would have been straight and their actual numbers would not have
been very informative, apart from giving an interpretation of size
or distances when the unit is known, and familiar to the map reader.
Graticules also shed light on which projection was used:
equirectangular or Mercator projections have straight vertical and
horizontal lines, conic projections have straight but diverging
meridians, and equal area projections may have curved meridians.
\newpage
On @fig-world and most other maps the real navigation aid comes from
geographical features like the state outline, country outlines,
coast lines, rivers, roads, railways and so on. If these are added
sparsely and sufficiently, a graticule can as well be omitted. In
such cases, maps look good without axes, tics, and labels, leaving
up a lot of plotting space to be filled with actual map data.
## Base `plot`
\index{sf!plot method}
The `plot` method for `sf` and `stars` objects try to make quick,
useful, exploratory plots; for higher quality plots and more
configurability, alternatives with more control and/or better
defaults are offered for instance by packages **ggplot2** [@R-ggplot2],
**tmap** [@R-tmap; @tmap], or **mapsf** [@R-mapsf].
By default, the plot method tries to plot "all" it is given.
This means that:
* given a geometry only (`sfc`), the geometry is plotted, without colours
* given a geometry and an attribute, the geometry is coloured according to
the values of the attribute, using a qualitative colour scale for `factor`
or `logical` attributes and a continuous scale otherwise, and a colour key
is added
* given multiple attributes, multiple maps are plotted, each with a colour
scale but a key is by default omitted, as colour assignment is
done on a per sub-map basis
* for `stars` objects with multiple attributes, only the first
attribute is plotted; for three-dimensional raster cubes, all
slices over the third dimension are plotted as sub-plots
### Adding to plots with legends
\index{sf!plot!legend}
The `plot` methods for `stars` and `sf` objects may show a colour key
on one of the sides (@fig-first-map). To do this
with `base::plot`, the plot region is split in two and two plots are
created: one with the map, and one with the legend. By default, the
`plot` function resets the graphics device (using `layout(matrix(1))`
so that subsequent plots are not hindered by the device being split
in two, but this prevents adding graphic elements subsequently.
To _add_ to an existing plot with a colour legend, the device reset
needs to be prevented by using `reset = FALSE` in the `plot`
command, and using `add = TRUE` in subsequent calls to `plot`.
An example is
```{r fig-figreset}
#| fig.cap: "Annotating base plots with a legend"
library(sf)
nc <- read_sf(system.file("gpkg/nc.gpkg", package = "sf"))
plot(nc["BIR74"], reset = FALSE, key.pos = 4)
plot(st_buffer(nc[1,1], units::set_units(10, km)), col = 'NA',
border = 'red', lwd = 2, add = TRUE)
```
which is shown in @fig-figreset. Annotating `stars`
plots can be done in the same way when a _single_ stars layer is
shown. Annotating `stars` facet plots with multiple cube slices can be
done by adding a "hook" function that will be called on every slice
shown, as in
```{r fig-starshook}
#| fig.cap: "Annotated multi-slice stars plot"
library(stars)
system.file("tif/L7_ETMs.tif", package = "stars") |>
read_stars() -> r
st_bbox(r) |> st_as_sfc() |> st_sample(5) |>
st_buffer(300) -> circ
hook <- function() {
plot(circ, col = NA, border = 'yellow', add = TRUE)
}
plot(r, hook = hook, key.pos = 4)
```
and as shown in @fig-starshook. Hook functions have access to facet
parameters, facet label and bounding box.
Base plot methods have access to the resolution of the screen device,
and hence the base plot method for `stars` and `stars_proxy` object
will downsample dense rasters and only plot pixels at a density
that makes sense for the device available.
### Projections in base plots
The base `plot` method plots data with ellipsoidal coordinates
using the equirectangular projection, using a latitude parameter
equal to the middle latitude of the data bounding box
(@fig-eqc). To control this parameter, either a projection to
another equirectangular can be applied before plotting, or the
parameter `asp` can be set to override: `asp=1` would lead to
plate carrée (@fig-world) left. Subsequent plots need to be in
the same coordinate reference system in order to make sense with
over-plotting; this is not being checked.
### Colours and colour breaks
In base plots, argument `nbreaks` can be used to set the number of
colour breaks and argument `breaks` either to the numeric vector
with actual breaks, or to a style value for the `style` argument in
`classInt::classIntervals`.
## Maps with `ggplot2` {#sec-geomsf}
\index{ggplot2}
\index{geom\_sf}
Package **ggplot2** [@R-ggplot2; @ggplot2] can create
more complex and nicer looking graphs; it has a geometry `geom_sf`
that was developed in conjunction with the development of `sf` and
helps creating beautiful maps. An introduction to this is found in
@moreno. A first example is shown in @fig-firstgather.
The code used for this plot is:
```{r}
library(tidyverse) |> suppressPackageStartupMessages()
nc.32119 <- st_transform(nc, 32119)
year_labels <-
c("SID74" = "1974 - 1978", "SID79" = "1979 - 1984")
nc.32119 |> select(SID74, SID79) |>
pivot_longer(starts_with("SID")) -> nc_longer
```
```{r eval = FALSE}
ggplot() + geom_sf(data = nc_longer, aes(fill = value), linewidth = 0.4) +
facet_wrap(~ name, ncol = 1,
labeller = labeller(name = year_labels)) +
scale_y_continuous(breaks = 34:36) +
scale_fill_gradientn(colours = sf.colors(20)) +
theme(panel.grid.major = element_line(colour = "white"))
```
where we see that two attributes had to be stacked (`pivot_longer`)
before plotting them as facets: this is the idea behind "tidy" data,
and the `pivot_longer` method for `sf` objects automatically stacks
the geometry column too.
Because `ggplot2` creates graphics _objects_ before plotting them,
it can control the coordinate reference system of all elements
involved, and will transform or convert all subsequent objects to
the coordinate reference system of the first. It will also draw a
graticule for the (default) thin white lines on a grey background,
and uses a datum (by default: WGS84) for this. `geom_sf` can be
combined with other geoms, for instance to allow for annotating
plots.
\index{geom\_stars}
For package **stars**, a `geom_stars` has, at the moment of writing
this, rather limited scope: it uses `geom_sf` for map layout and vector data
cubes, and adds `geom_raster` for regular rasters and `geom_rect`
for rectilinear rasters. It downsamples if the user specifies a
downsampling rate, but has no access to the screen dimensions to
automatically choose a downsampling rate. This may be just enough,
for instance @fig-ggplotstars is created by the following commands:
```{r fig-ggplotstars}
#| fig.cap: "Simple facet raster plot with `ggplot2` and `geom_stars`"
library(ggplot2)
library(stars)
r <- read_stars(system.file("tif/L7_ETMs.tif", package = "stars"))
ggplot() + geom_stars(data = r) +
facet_wrap(~band) + coord_equal() +
theme_void() +
scale_x_discrete(expand = c(0,0)) +
scale_y_discrete(expand = c(0,0)) +
scale_fill_viridis_c()
```
More elaborate `ggplot2`-based plots with `stars` objects may be
obtained using package **ggspatial** [@R-ggspatial]. Non-compatible
but nevertheless `ggplot2`-style plots can be created with `tmap`,
a package dedicated to creating high quality maps (@sec-tmap).
\index[function]{coord\_sf}
When combining several feature sets with varying coordinate reference
systems, using `geom_sf`, all sets are transformed to the reference
system of the first set. To further control the "base"
coordinate reference system, `coord_sf` can be used. This allows
for instance working in a projected system, while combining graphics
elements that are _not_ `sf` objects but regular `data.frame`s
with ellipsoidal coordinates associated to WGS84.
::: {.content-visible when-format="html"}
A twitter thread by Claus Wilke illustrating this is found
[here](https://twitter.com/ClausWilke/status/1275938314055561216).
:::
<!---
FIXME: markup visible in PDF and HTML rendering
--->
## Maps with `tmap` {#sec-tmap}
\index{tmap}
Package **tmap** [@R-tmap; @tmap] takes a fresh look at plotting
spatial data in R. It resembles `ggplot2` in the sense that it
composes graphics objects before printing by building on the `grid`
package, and by concatenating map elements with a `+` between them,
but otherwise it is entirely independent from, and incompatible
with, `ggplot2`. It has a number of options that allow for highly
professional looking maps, and many defaults have been carefully
chosen. Creating a map with two similar attributes can be done
using `tm_polygons` with two attributes, we can use
```{r echo=TRUE, eval=FALSE}
library(tmap)
system.file("gpkg/nc.gpkg", package = "sf") |>
read_sf() |> st_transform('EPSG:32119') -> nc.32119
tm_shape(nc.32119) +
tm_polygons(c("SID74", "SID79"), title = "SIDS") +
tm_layout(legend.outside = TRUE,
panel.labels = c("1974-78", "1979-84")) +
tm_facets(free.scales=FALSE)
```
to create @fig-tmapnc:
```{r fig-tmapnc, echo =!knitr::is_latex_output()}
#| fig.cap: "**tmap**: using `tm_polygons()` with two attribute names"
#| code-fold: true
library(tmap)
system.file("gpkg/nc.gpkg", package = "sf") |>
read_sf() |>
st_transform('EPSG:32119') -> nc.32119
tm_shape(nc.32119) + tm_polygons(c("SID74", "SID79"), title="SIDS") +
tm_layout(legend.outside=TRUE, panel.labels=c("1974-78", "1979-84")) +
tm_facets(free.scales=FALSE)
```
<!---
```{r eval=FALSE}
nc_longer <- nc.32119 |> select(SID74, SID79) |>
pivot_longer(starts_with("SID"), values_to = "SID")
tm_shape(nc_longer) + tm_polygons("SID") +
tm_facets(by = "name")
```
```{r fig-tmapnc2, echo = FALSE}
#| fig.cap: "**tmap**: Using `tm_facets()` on a long table"
nc.32119 |> select(SID74, SID79) |>
pivot_longer(starts_with("SID"), values_to = "SID") -> nc_longer
tm_shape(nc_longer) + tm_polygons("SID") + tm_facets(by = "name")
```
to create @fig-tmapnc2.
-->
Alternatively, from the long table form obtained by `pivot_longer`
one could use `+ tm_polygons("SID") + tm_facets(by = "name")`.
Package **tmap** also has support for `stars` objects, an example created with
```{r, eval = FALSE}
tm_shape(r) + tm_raster()
```
```{r fig-tmapstars, echo = FALSE}
#| fig.cap: "Simple raster plot with tmap"
tm_shape(r) + tm_raster()
```
is shown in @fig-tmapstars. More examples of the use of **tmap**
are given in Chapters [-@sec-area]-[-@sec-spatglmm].
## Interactive maps: `leaflet`, `mapview`, `tmap`
\index{leaflet}
\index{leaflet!tmap}
\index{leaflet!mapview}
\index{mapview}
\index{tmap!interactive views}
::: {.content-visible when-format="html"}
Interactive maps as shown in @fig-mapviewfigure can be
created with R packages **leaflet**, **mapview**, or **tmap**. Package **mapview**
adds a number of capabilities to **leaflet** including a map legend,
configurable pop-up windows when clicking features, support for
raster data, and scalable maps with very large feature sets using
the FlatGeobuf file format, as well as facet maps that react
synchronously to zoom and pan actions. Package **tmap** has the
option that after giving
```{r eval=FALSE}
tmap_mode("view")
```
all usual `tmap` commands are applied to an interactive html/leaflet widget,
whereas after
```{r eval=FALSE}
tmap_mode("plot")
```
again all output is sent to R's own (static) graphics device.
:::
::: {.content-visible when-format="pdf"}
Interactive maps as shown in @fig-mapviewfigurepdf can be
created with R packages **leaflet**, **mapview** or **tmap**. **mapview**
adds a number of capabilities to **leaflet** including a map legend,
configurable pop-up windows when clicking features, support for
raster data, and scalable maps with very large feature sets using
the FlatGeobuf file format, as well as facet maps that react
synchronously to zoom and pan actions. Package **tmap** has the
option that after giving
```{r eval=FALSE}
tmap_mode("view")
```
all usual `tmap` commands are applied to an interactive html/leaflet widget,
whereas after
```{r eval=FALSE}
tmap_mode("plot")
```
again all output is sent to R's own (static) graphics device.
:::
## Exercises
1. For the countries Indonesia and Canada, create individual plots using
equirectangular, orthographic, and Lambert equal area projections, while
choosing projection parameters sensible for the area.
1. Recreate the plot in @fig-figreset with **ggplot2** and with **tmap**.
1. Recreate the plot in @fig-tmapstars using the `viridis` colour ramp.
1. View the interactive plot in @fig-tmapstars using the "view"
(interactive) mode of `tmap`, and explore which interactions are possible; also
explore adding `+ tm_facets(as.layers=TRUE)` and try switching layers on and off.
Try also setting a transparency value to 0.5.