forked from JuliaEarth/geospatial-data-science-with-julia
-
Notifications
You must be signed in to change notification settings - Fork 0
/
03-geoio.qmd
160 lines (118 loc) · 5.54 KB
/
03-geoio.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
# Interfacing with GIS
```{julia}
#| echo: false
#| output: false
import Pkg
Pkg.activate(".")
using GeoStats
import CairoMakie as Mke
```
In order to disrupt existing practices and really develop
something new in Julia, we had to make some hard decisions
along the way. One of these decisions relates to how we are
willing to interface our framework with existing GIS
standards and workflows.
On the one hand, we could have followed the path that was
followed by other communities such as Python and R, and
focus our energy interfacing with well-tested GIS libraries
written in C/C++ (e.g., [GDAL](https://gdal.org/index.html),
[GEOS](https://libgeos.org)). This is precisely what the
JuliaGeo organization has been doing over the years, and it
is an important agenda to bring people from other languages
that are used to the [OGC](https://www.ogc.org/standards)
standards.
On the other hand, we have young geoscientists and first-time
programmers who have never studied GIS before, and who *really
struggle* learning the technology as it is today. The widespread
emphasis on machine representation and software engineering
has created a gap between the developers and the users of GIS
software. A typical gap the Julia programming language helps to close.
We decided to limit our interface with existing GIS technology to
input and output (IO) of files. This gives users of the framework
the chance to
1. Import geospatial data stored as simple features
2. Perform geospatial data science with a rich set of tools
3. Export results to widely used software (e.g., [QGIS](https://qgis.org),
[ArcGIS](https://www.arcgis.com/index.html))
It creates an ecosystem where users can become contributors and
maintainers of the framework, without any knowledge of a second
programming language.
## GeoIO.jl
The [GeoIO.jl](https://github.com/JuliaEarth/GeoIO.jl) module can load
and save geospatial data on disk in a variety of formats, including the
most popular formats in GIS (e.g., `.shp`, `.geojson`, `.kml`, `.parquet`)
thanks to various backend packages spread across various Julia organizations.
It is designed for users who just want to get their data ready for
geospatial data science.
To load a file from disk, we use `GeoIO.load`:
```julia
using GeoIO
geotable = GeoIO.load("file.shp")
```
The function automatically selects the backend based on the file extension,
converts the simple features into a geospatial domain, and returns a `GeoTable`.
To save the `GeoTable` to disk, possibly in a different format, we use `GeoIO.save`:
```julia
GeoIO.save("file.geojson", geotable)
```
The module fixes inconsistencies between formats whenever possible. For
example, the GeoJSON format writes `Date` columns as `String` because
the JSON format has no date types. The Shapefile format has its own
limitations, etc.
Over time, we expect to improve the ecosystem as a whole by highlighting
various issues with available standards and backend implementations.
## File formats
Most GIS file formats do **not** preserve topological information.
This means that neighborhood information is lost as soon as geometries
are saved to disk. To illustrate this issue, we consider a geotable over
a grid:
```{julia}
using GeoIO
geotable = GeoIO.load("data/earth.tif")
```
If we save the geotable to a `.geojson` file on disk, and then load it back,
we observe that the `CartesianGrid` gets replaced by a `GeometrySet`:
```{julia}
fname = tempname() * ".geojson"
GeoIO.save(fname, geotable)
GeoIO.load(fname)
```
Other file formats such as `.ply` and `.msh` are widely used in
[computer graphics](https://en.wikipedia.org/wiki/Computer_graphics)
to save geospatial data over meshes, and preserve topological information:
```{julia}
beethoven = GeoIO.load("data/beethoven.ply")
viz(beethoven.geometry)
```
## Rationale
Now that we have set expectations for our interface with GIS,
let's address an important question that many readers might have coming
from other communities:
> Do we gain anything by not adhering to **programming interfaces**?
The answer is an emphatic **YES**! It means that we have total freedom to innovate
and improve the representation of various geometries and geospatial domains with
Julia's amazing type system. To give a simple example, let's take a look at the
`Triangle` geometry:
```{julia}
t = Triangle((0, 0), (1, 0), (1, 1))
```
If we treated this geometry as a generic polygon represented by a vector of vertices
in memory, like it is done in [GeoInterface.jl](https://github.com/JuliaGeo/GeoInterface.jl)
for example, we wouldn't be able to dispatch optimized code that is only valid for a triangle:
```{julia}
@code_llvm isconvex(t)
```
::: {.callout-note}
In Julia, the macro `@code_llvm` shows the underlying code sent to the LLVM compiler.
In this case, the code is the single line `ret i8 1`, which is the instruction to
return the constant integer `1`.
:::
Notice how the `isconvex` function is compiled away to the **constant** `1` (i.e. `true`) when
called on the triangle. The code for a generic polygon is much more complicated and
requires runtime checks that are too expensive to afford, especially in 3D.
Another reason to not adhere to a generic interface is that we can store information
in the geometry types themselves (e.g., coordinate reference system) that is relevant
to design advanced scientific visualization features illustrated in the previous chapter,
and to dispatch specialized algorithms from geodesic geometry.
Having cleared that up, we will now proceed to the last foundational chapter of the book,
which covers the advanced geometric processing features of the framework.