This repository has been archived by the owner on Sep 21, 2019. It is now read-only.

The Open Government Secretariat's (OGS) Open Maps (OM) harvester pulling from the Federal Geospatial Platform (FGP)

Open Maps - Federal Geospatial Platform Harvester

The Open Government Secretariat's (OGS) Open Maps (OM) harvester pulling from the Federal Geospatial Platform (FGP) developed and maintained at Statistics Canada (StatCan).

Figure: Harvester - FGP - Diagram

harvest_hnap.py

Extracts HNAP XML from the CSW source and prints the XML to standard output, so it can be piped to another command or redirected to a file.

./harvest_hnap.py > hnap.xml
or
./harvest_hnap.py | parsing_command

Presently it extracts everything, but it will eventually extract a window of data (e.g., metadata records updated in the last two weeks). The alternate time-filtering request is available and commented out in the script.
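
As an illustration of what a time-window request could look like, here is a minimal sketch that computes the start of a two-week window and wraps it in an OGC filter fragment. It is not the request commented out in the script: the function names are hypothetical, and the queryable name `Modified` is an assumption that should be checked against the CSW source's capabilities.

```python
from datetime import date, timedelta
from xml.sax.saxutils import escape


def window_start(today: date, days: int = 14) -> str:
    """Return the ISO date that opens a window of `days` days ending today."""
    return (today - timedelta(days=days)).isoformat()


def modified_since_filter(since: str, prop: str = "Modified") -> str:
    """Build an OGC filter matching records modified on or after `since`.

    `prop` is an assumed queryable name; the real CSW source may expose
    the modification date under a different name.
    """
    return (
        '<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">'
        "<ogc:PropertyIsGreaterThanOrEqualTo>"
        f"<ogc:PropertyName>{escape(prop)}</ogc:PropertyName>"
        f"<ogc:Literal>{escape(since)}</ogc:Literal>"
        "</ogc:PropertyIsGreaterThanOrEqualTo>"
        "</ogc:Filter>"
    )
```

Calling `modified_since_filter(window_start(date.today()))` would yield a fragment suitable for the constraint of a GetRecords request, assuming the source accepts that queryable.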

This process runs in a few seconds depending on network latency.

hnap2cc-json.py

Converts an HNAP XML file to a Common Core-mapped, CKAN-compliant JSON Lines file. Accepts input either streamed on standard input or as a file path argument, and prints JSON Lines to standard output.

./harvest_hnap.py | ./hnap2json.py > CommonCore_CKAN.jsonl
or
cat hnap.xml | ./hnap2json.py > CommonCore_CKAN.jsonl
or
./hnap2json.py hnap.xml > CommonCore_CKAN.jsonl 
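
The three invocations above rely on a common pattern: read from a file when a path is given, otherwise from standard input, and write one JSON object per line. The following is a minimal sketch of that pattern only, not the converter's actual code; `open_input` and `write_jsonl` are hypothetical names.

```python
import json
import sys


def open_input(argv, stdin):
    """Return a readable stream: the file named in argv[1], else stdin."""
    if len(argv) > 1:
        return open(argv[1], "r", encoding="utf-8")
    return stdin


def write_jsonl(records, out):
    """Write each record as one JSON document per line (JSON Lines)."""
    for record in records:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A script built this way would call `open_input(sys.argv, sys.stdin)` to obtain its source and `write_jsonl(records, sys.stdout)` to emit its results, which is what lets it sit in the middle of a pipe.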

This process runs in a couple of seconds.

Import to CKAN

Uploading the JSON Lines file has been tested with the ckanapi CLI:

ckanapi load datasets -I CommonCore_CKAN.jsonl -r http://target.ckan.instance.ca/ -a <user api key>
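
Because the load step reads one dataset per line, it can be worth checking the file before pushing it. The sketch below is an optional pre-flight check and is not part of this repository; `check_jsonl` is a hypothetical helper that confirms every non-blank line parses as a JSON object and returns the record count.

```python
import json


def check_jsonl(lines):
    """Count records in a JSON Lines stream; raise ValueError on a bad line."""
    count = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {number}: expected a JSON object")
        count += 1
    return count
```

Passing an open file handle for `CommonCore_CKAN.jsonl` to `check_jsonl` gives the number of datasets about to be uploaded.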

Depending on how much data is being pushed, this process runs in under 20 seconds.

Timing

Since these commands together run in under a minute, the process could safely cycle every 5 minutes. However, because GeoNetwork uploads in batches (and other departments might too), we should be more careful.

From a process standpoint, a daily or weekly cycle is reasonable for R1. We’ll start by assuming weekly until we hear otherwise.
