Motivation

We want to download several layers from a GeoServer instance every day to create snapshots that can later be used as geo-temporal data.

Requirements

support for GML files only
create snapshots once per day or less frequently
retry download if failed (but only up to some retry limit, e.g. max 3 times)
use content from previous day if failed to download (but keep track of it)
the data can potentially be several GB of XML
store the snapshots efficiently (compressed, only diffs ...)
use cron to trigger the download every N minutes
when triggered, only a single GML file should be downloaded at a time
keep low profile - avoid parallel downloads, limit transfer rate
every file is downloaded only once per day
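
The retry and fallback requirements above could be sketched in shell roughly as follows. This is a minimal sketch, not the repository's implementation: the helper names `retry` and `fetch_or_keep`, the `RETRY_DELAY` variable, and the idea of logging failures into the `.meta` file are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds, at most MAX times.
# Usage: retry MAX command [args...]
retry() {
  max=$1
  shift
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1                    # give up after MAX failed attempts
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-5}"     # RETRY_DELAY is an assumed tuning knob
  done
  return 0
}

# Hypothetical helper: replace TARGET only if the download succeeds,
# otherwise keep the previous content and record the failure.
# The command is called with a temporary output path as its last argument.
# Usage: fetch_or_keep TARGET command [args...]
fetch_or_keep() {
  target=$1
  shift
  tmp="$target.tmp"
  if retry 3 "$@" "$tmp"; then
    mv "$tmp" "$target"
  else
    rm -f "$tmp"
    # Keep track of the failure (assumed format) next to the old snapshot.
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) download failed, kept previous content" \
      >> "${target%.gml}.meta"
    return 1
  fi
}
```

With a download function that writes to the path given as its last argument, a call might look like `fetch_or_keep DIR/layer.gml my_download_function`. A crontab entry such as `*/10 * * * * /path/to/download_layer.sh URL DIR` would then trigger one attempt every 10 minutes; the interval is only an example.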

Implementation

we need a URL of the geoserver and a directory DIR where the layers will be downloaded

script download_layer.sh URL DIR:

picks the oldest *.gml file from DIR
downloads the corresponding layer from URL using WFS as GML (XML)
formats the XML files using xmllint --format
removes fid XML attributes (because they are always newly generated by the geoserver)
for every *.gml file a corresponding *.meta file will be generated which contains some accounting information about the download
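
The steps listed above can be sketched as a handful of shell functions. This is an illustration under stated assumptions, not the actual `download_layer.sh`: the function names, the 500k rate limit, WFS version 1.0.0, the `.meta` line format, and the assumption that each file name equals its layer name are all hypothetical, and URL is assumed to be a bare WFS endpoint without a query string.

```shell
#!/bin/sh
# Sketch only: names, rate limit and .meta format are assumptions.

# Print the path of the least recently modified *.gml file in directory $1.
pick_oldest() {
  # ls -t sorts newest first; -r reverses, so the oldest file comes first.
  ls -tr "$1"/*.gml 2>/dev/null | head -n 1
}

# Drop fid="..." attributes from XML read on stdin.
strip_fid() {
  sed 's/ fid="[^"]*"//g'
}

# Download one layer. Usage: download_layer URL DIR
download_layer() {
  url=$1
  dir=$2
  file=$(pick_oldest "$dir")
  [ -n "$file" ] || return 1
  layer=$(basename "$file" .gml)   # assumes file name equals layer name
  raw="$file.raw"
  pretty="$file.pretty"

  # Standard WFS key-value GetFeature request; version 1.0.0 is an assumption.
  curl --fail --silent --show-error --limit-rate 500k \
       --output "$raw" \
       "$url?service=WFS&version=1.0.0&request=GetFeature&typeName=$layer" \
    || { rm -f "$raw"; return 1; }

  # Only overwrite the previous snapshot once formatting has succeeded.
  if xmllint --format "$raw" > "$pretty"; then
    strip_fid < "$pretty" > "$file"
    echo "downloaded=$(date -u +%Y-%m-%dT%H:%M:%SZ) status=ok" \
      > "${file%.gml}.meta"
    status=0
  else
    status=1
  fi
  rm -f "$raw" "$pretty"
  return "$status"
}
```

In this sketch a successful run rewrites the chosen file, which makes it the most recently modified one, so the next run picks a different layer. A failed run leaves the old file untouched, so it stays the oldest and is tried again on the next trigger.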

duplicity:

a useful tool for incremental backups (see usage examples below)

Incremental snapshots using duplicity:

duplicity -vi --allow-source-mismatch --no-encryption path/to/src/dir file://path/to/snapshot/dir

-vi = verbosity level is "info"
--allow-source-mismatch = do not abort when the source directory differs from the one used for earlier backups to the same target (e.g. after the directory was renamed)

Listing existing snapshots

duplicity collection-status file://path/to/my/snapshot/dir

Restoring a snapshot

duplicity restore --no-encryption --time 2016-06-30T11:00:00 file://path/to/snapshot/dir path/to/output/dir

Showing summary of differences

Assuming we want to compare differences between directories dir1 and dir2 and that we want to ignore files matching a pattern *.meta:

diff -x '*.meta' dir1 dir2 | diffstat

Output should look similar to this (the file names are only an example):

 include/net/bluetooth/l2cap.h |    6 ++++++
 net/bluetooth/l2cap.c         |   18 +++++++++---------
 2 files changed, 15 insertions(+), 9 deletions(-)

Rename stuff by removing prefix from filename

Assuming you are in a directory that contains the files and the prefix is "PREFIX" (this is just a quick-and-dirty method; there is certainly a better way to do it):

find . -maxdepth 1 -type f -name 'PREFIX*' | while IFS= read -r F; do mv -- "$F" "${F#./PREFIX}"; done
