This project implements a webapp that enables the import of spatial datasets (SpatialUnits, Georesources and Indicators) into the KomMonitor Spatial Data Infrastructure. The webapp provides an API for retrieving data from various data sources in different formats and for converting the datasets into a KomMonitor-specific schema. The converted datasets become available within the KomMonitor Spatial Data Infrastructure once they are published via the Data Management API.
- DockerHub repositories of KomMonitor Stack
- GitHub Repositories of KomMonitor Stack
- GitHub Wiki for KomMonitor Guidance and central Documentation
- Technical Guidance and Deployment Information for complete KomMonitor stack on GitHub Wiki
- KomMonitor Website
KomMonitor Importer requires:
- a running instance of KomMonitor Data Management, to which any import/update requests are forwarded.
- an optional and configurable connection to a running Keycloak server, if role-based data access is activated via the configuration of the KomMonitor stack.
The KomMonitor Importer project comprises different modules that encapsulate different tasks:
This module contains the API model classes which will be used for serialization and deserialization of JSON payloads for API calls via Jackson annotations. The model classes were auto-generated from the KomMonitor Importer OpenAPI specs using Swagger Codegen.
The core module provides the main data entities that represent KomMonitor related resources as well as classes that are responsible for importing datasets. In particular, these are:
- `DataSourceRetriever` implementations for fetching datasets from a certain data source (package: `org.n52.kommonitor.importer.io.datasource`)
- `Converter` implementations that support the conversion of certain data formats (package: `org.n52.kommonitor.importer.converter`)
The above-mentioned packages also provide interfaces that can be implemented to extend the project with additional data source retrievers and converters (see: Extend the Importer API).
In addition, the module contains some helper and service classes for:
- performing HTTP requests
- storing uploaded files
- performing common geometrical operations
- dealing with the GeoTools Feature model for decoding properties and geometries
The API module primarily comprises three kinds of classes that are responsible for implementing the API endpoints:
- API interfaces contain request mappings for all API endpoints, defined by Spring MVC annotations as well as Swagger annotations for generating the API documentation (package: `org.n52.kommonitor.importer.api`)
- Controller classes, which are annotated with `@Controller` to enable Spring's autodetection, implement the API interfaces (package: `org.n52.kommonitor.importer.api`)
- Handler classes are used by the controllers to delegate the request handling depending on the particular request (package: `org.n52.kommonitor.importer.api.handler`)
Just like the API model classes, the API interfaces were auto-generated from the KomMonitor Importer OpenAPI specs using Swagger Codegen.
This module provides API clients for the Data Management API endpoints. The client implementation is based on Spring RestTemplate and was generated from the KomMonitor Data Management OpenAPI specs via Swagger Codegen. In addition, end-to-end tests for some of the client methods that are required for the Importer have been implemented using Spring MockRestServiceServer.
The App module contains the main class that is responsible for launching the Spring Boot application as well as an `application.yml` that provides different properties for externalized configuration. Furthermore, the module contains some configuration classes that utilize `application.yml` properties for configuring different Spring beans that will be injected within the other modules.
There are some requirements regarding your build environment in order to build and run the KomMonitor Importer API from source:
- at least Java SE Development Kit 8 must be available
- to build the project from source, Maven is required
- for cloning the repository, Git must be installed
- if you wish to run the application as Docker container, you also need Docker
You can download the latest branch directly from GitHub or, if you have Git installed in your environment, just run `git clone https://github.com/SebaDro/kommonitor-importer.git`. After cloning the repository, run `mvn clean install` from the repository's root directory to build the whole project from source.
The KomMonitor Importer API is a Spring Boot application and provides an `application.yml` within the KomMonitor Importer App module for externalized configuration. The properties within this file can be used to configure the application. Documentation for common application properties can be found at https://docs.spring.io/spring-boot/docs/current/reference/html/appendix-application-properties.html.
Furthermore, the `application.yml` contains some additional and custom configuration properties:
- `kommonitor.importer.datamanagement-api-url`: endpoint of the KomMonitor Data Management API
- `kommonitor.importer.fileStorageLocation`: path to the file storage directory that will be used for storing uploaded files
- `springfox.documentation.swagger.v2.path`: defines the default context path for retrieving the API documentation
There are different deployment patterns for running the application:
If you have built the project from source, the KomMonitor Importer App has been packaged as a JAR artifact. Just execute the JAR within the target folder of the `kommonitor-importer-app` module to run the application. You can also start the application by running `mvn spring-boot:run` from the root of the `kommonitor-importer-app` module.
The repository also contains a Dockerfile for building a Docker image of the application. To build the image, run `docker build -t kommonitor/importer:latest .` from the root of the repository. Finally, a container with published port 8087 can be started with `docker run -p 8087:8087 kommonitor/importer`.
By default, the started application is available under http://localhost:8087.
The following docker-compose example only contains a subset of the whole KomMonitor stack in order to focus on the configuration parameters of this component:
version: '2.1'
networks:
kommonitor:
driver: bridge
services:
# importer component that can import spatial resources from different data sources (e.g. GeoJSON, CSV, WFS),
# sanity-check them and forward data integration requests to data management component
kommonitor-importer:
image: 'kommonitor/importer'
container_name: kommonitor-importer
#restart: unless-stopped
ports:
- 8087:8087
volumes:
- importer_data:/tmp/importer # storage location where to store "uploaded files"; files can be uploaded to importer, but currently will never be deleted; hence manually delete them if required
environment:
- kommonitor.importer.datamanagement-api-url=http://kommonitor-data-management:8085/management # target URL to running Data Management component ending with "/management" (/management is internal base path of data management component) - best use docker name and port within same network
- JAVA_OPTS=-Dorg.geotools.referencing.forceXY=true # important setting that coordinate system axes shall follow order XY (default is YX, but KomMonitor Data Management component expects axis order XY; e.g. longitude, latitude)
- logging.level.org.n52.kommonitor=ERROR # adjust logging level [e.g. "INFO", "WARN", "ERROR"] - ERROR logs only errors
- KOMMONITOR_SWAGGERUI_BASEPATH= #depending on DNS Routing and Reverse Proxy setup a base path can be set here to access swagger-ui interface (e.g. set '/data-importer' if https://kommonitor-url.de/data-importer works as entry point for localhost:8087)
- KOMMONITOR_SWAGGER_UI_SECURITY_CLIENT_ID=kommonitor-importer # client/resource id of importer component in Keycloak realm
- KOMMONITOR_SWAGGER_UI_SECURITY_SECRET=secret # WARNING: DO NOT SET IN PRODUCTION!!! Keycloak secret of this component within Credentials tab of respective Keycloak client; secret for swagger-ui to authorize swagger-ui requests in a Keycloak-active scenario (mostly this should not be set, as users with access to swagger-ui (e.g. 'http://localhost:8087/swagger-ui.html') could then authorize without own user account and perform CRUD requests)
- KEYCLOAK_ENABLED=false # enable/disable role-based data access using Keycloak (true requires working Keycloak Setup and enforces that all other components must be configured to enable Keycloak as well)
- KEYCLOAK_AUTH_SERVER_URL=https://keycloak.fbg-hsbo.de/auth # Keycloak URL ending with '/auth/'
- KEYCLOAK_REALM=kommonitor # Keycloak realm name
- KEYCLOAK_RESOURCE=kommonitor-importer # client/resource id of importer component in Keycloak realm
- KEYCLOAK_CREDENTIALS_SECRET=secret # Keycloak secret of this component within Credentials tab of respective Keycloak client; must be set here
- SERVER_PORT=8087 # Server port; default is 8087
networks:
- kommonitor
# database container; must use PostGIS database
# database is not required to run in docker - will be configured in Data Management component
kommonitor-db:
image: mdillon/postgis
container_name: kommonitor-db
#restart: unless-stopped
ports:
- 5432:5432
environment:
- POSTGRES_USER=kommonitor # database user (will be created on startup if not exists) - same settings in data management service
- POSTGRES_PASSWORD=kommonitor # database password (will be created on startup if not exists) - same settings in data management service
- POSTGRES_DB=kommonitor_data # database name (will be created on startup if not exists) - same settings in data management service
volumes:
- postgres_data:/var/lib/postgresql/data # persist database data on disk (crucial for compose down calls to let data survive)
networks:
- kommonitor
# Data Management component encapsulating the database access and management as REST service
kommonitor-data-management:
image: kommonitor/data-management
container_name: kommonitor-data-management
#restart: unless-stopped
depends_on:
- kommonitor-db # only if database runs as docker container as well
ports:
- "8085:8085"
networks:
- kommonitor
links:
- kommonitor-db
environment:
# - env parameters omitted here for brevity
volumes:
postgres_data:
importer_data:
The entry point for the KomMonitor Importer API is http://localhost:8087. If you call this URL without any additional endpoint, you will be redirected to http://localhost:8087/swagger-ui.html. This page provides a Swagger UI, which has been generated from the OpenAPI specification for visualizing and interacting with the API resources. So, feel free to try out the different API endpoints via Swagger UI to get started.
The Importer API supports selected data source types. For each type, the application contains an appropriate implementation that is responsible for accessing the data source and retrieving datasets from it. The API has two endpoints for retrieving information about the supported data source types:
- `/datasourceTypes`: lists all supported types of a data source
- `/datasourceTypes/{type}`: provides detailed information and supported parameters for a certain data source type
Retrieving datasets from a certain data source is only one aspect the Importer API has to deal with. Another one is parsing the dataset in order to map it to a format the Data Management API can handle. Hence, the API provides different converters, each one supporting a certain data format. You will find information about the available converters via the following endpoints (an illustrative response sketch follows the list):
- `/converters`: lists all available converters and their supported data formats
- `/converters/{name}`: provides detailed information and supported parameters for a certain converter
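The exact response schema is defined by the Importer's OpenAPI specification. Purely as an illustration (the field names below are illustrative, not authoritative), a converter description covers the converter's name, its supported MIME types, encodings and schemas, and its additional parameters:

```json
{
  "name": "org.n52.kommonitor.importer.converter.wfs.v1",
  "mimeTypes": ["application/xml"],
  "encodings": ["UTF-8"],
  "schemas": ["http://schemas.opengis.net/wfs/1.0.0/wfs.xsd"],
  "parameters": [
    {
      "name": "CRS",
      "description": "Coordinate reference system of the dataset to import",
      "type": "string"
    }
  ]
}
```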
If you plan to import a dataset that is stored within a file, you first have to upload this file to the server so that it becomes accessible to the data source retriever. The upload is done by performing a POST request against the `/upload` endpoint with a multipart message that contains the file. Optionally, you can set a custom file name within the multipart message that will be used for storing the file on the server. You can retrieve a list of all previously uploaded files by performing a GET request on the `/upload` endpoint.
For each resource type of the KomMonitor Data Management API (Georesources, Spatial Units, Indicators), the Importer API provides an appropriate endpoint that triggers the import process. Within the POST body of an import request you have to define some required information about how to access a certain dataset and how to convert it into the KomMonitor-specific schema.
You can trigger the import of Georesources by sending a POST request to the `/georesources` endpoint. The request body has to contain the following properties (a sketched example request follows the list):
- `georesourcePostBody`: A JSON object in accordance with the POST request body for the `/georesources` endpoint of the Data Management API. Only the `geoJsonString` property must not be set, since its value will be generated as part of the import process. For all other properties, you can find detailed descriptions in the Data Management API documentation.
- `datasource`: Definition of the data source from which new datasets should be imported (see: Datasource Definition).
- `converter`: Definition of the converter that should be used for converting the imported dataset (see: Converter Definition).
- `propertyMapping`: Definitions for mapping properties from the imported dataset to required properties for spatial resources (see: Spatial Resource Property Mapping).
- `dryRun`: Indicates whether a dry run import should be performed. If `true`, the import process will be performed without posting the imported resources to the Data Management API. You should perform a dry run in order to get a preview of the resources that would be imported and of possible errors that occur during the import.
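To give an impression of how these parts fit together, the following is a rough, non-authoritative sketch of a complete `/georesources` import request body. It simply combines the datasource, converter and property mapping examples from the sections below; the inner fields of `georesourcePostBody` are only indicated by a placeholder (they follow the Data Management API documentation), and the exact property names and casing are defined by the Importer's OpenAPI specification.

```json
{
  "georesourcePostBody": {
    "...": "metadata properties as documented for the Data Management API POST /georesources body (geoJsonString must not be set)"
  },
  "dataSource": {
    "type": "HTTP",
    "parameters": [
      {
        "name": "URL",
        "value": "http://www.webgis-server.de/endpoint?SERVICE=WFS&REQUEST=getFeature&VERSION=1.1.0&TypeName=ns0:testType"
      }
    ]
  },
  "converter": {
    "name": "org.n52.kommonitor.importer.converter.wfs.v1",
    "encoding": "UTF-8",
    "mimeType": "application/xml",
    "schema": "http://schemas.opengis.net/wfs/1.0.0/wfs.xsd",
    "parameters": [
      {
        "name": "CRS",
        "value": "EPSG:25832"
      }
    ]
  },
  "propertyMapping": {
    "identifierProperty": "baublock_id",
    "nameProperty": "baublock_id",
    "keepAttributes": false,
    "keepMissingOrNullValueAttributes": false,
    "attributes": []
  },
  "dryRun": true
}
```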
The import of Spatial Units is done by sending a POST request to the `/spatial-units` endpoint. The request body has to contain the following properties:
- `spatialUnitPostBody`: A JSON object in accordance with the POST request body for the `/spatialUnits` endpoint of the Data Management API. Only the `geoJsonString` property must not be set, since its value will be generated as part of the import process. For all other properties, you can find detailed descriptions in the Data Management API documentation.
- `datasource`: Definition of the data source from which new datasets should be imported (see: Datasource Definition).
- `converter`: Definition of the converter that should be used for converting the imported dataset (see: Converter Definition).
- `propertyMapping`: Definitions for mapping properties from the imported dataset to required properties for spatial resources (see: Spatial Resource Property Mapping).
- `dryRun`: Indicates whether a dry run import should be performed. If `true`, the import process will be performed without posting the imported resources to the Data Management API. You should perform a dry run in order to get a preview of the resources that would be imported and of possible errors that occur during the import.
For importing an Indicator, you have to perform a POST request to the `/indicators` endpoint. The request body has to contain the following properties (a sketched example request follows the list):
- `indicatorPostBody`: A JSON object in accordance with the POST request body for the `/indicators` endpoint of the Data Management API. Only the `indicatorValues` property must not be set, since the time series values will be generated as part of the import process. For all other properties, you can find detailed descriptions in the Data Management API documentation.
- `datasource`: Definition of the data source from which new datasets should be imported (see: Datasource Definition).
- `converter`: Definition of the converter that should be used for converting the imported dataset (see: Converter Definition).
- `propertyMapping`: Definitions for mapping properties from the imported dataset to required properties for Indicators (see: Indicator Property Mapping).
- `dryRun`: Indicates whether a dry run import should be performed. If `true`, the import process will be performed without posting the imported resources to the Data Management API. You should perform a dry run in order to get a preview of the resources that would be imported and of possible errors that occur during the import.
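Again as a rough, non-authoritative sketch, an Indicator import request combines the same building blocks with an Indicator-specific property mapping (here the "Case 1" time series mapping explained in the Indicator Property Mapping section). The `dataSource` and `converter` objects are abbreviated to placeholders; see the Georesources sketch above for complete examples.

```json
{
  "indicatorPostBody": {
    "...": "metadata properties as documented for the Data Management API POST /indicators body (indicatorValues must not be set)"
  },
  "dataSource": { "...": "as in the Georesources sketch above" },
  "converter": { "...": "as in the Georesources sketch above" },
  "propertyMapping": {
    "spatialReferenceKeyProperty": "baublock_id",
    "keepMissingOrNullValueIndicator": false,
    "timeseriesMappings": [
      {
        "indicatorValueProperty": "altersdurchschnitt",
        "timestampProperty": "date"
      }
    ]
  },
  "dryRun": true
}
```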
The response body for a successful import request contains the following parameters:
- `uri`: If the imported resources were inserted into the Data Management API successfully, this is the URI of the resource. Note that the `uri` parameter won't be included for a dry run.
- `importedFeatures`: The IDs of all imported resources are included in this array.
- `errors`: The messages for all errors that occurred during the import will be listed within this parameter.
Below, you'll find an example of such a response:
{
"uri": "00b462d7-8903-40e9-8222-10f534afcbb6",
"importedFeatures": [
"_170",
"_152",
"_638"
],
"errors": [
"Failed conversion for resource _312. Cause(s): [Property 'dmg_altrstr_drchschnaltr' does not exist.]"
]
}
As you can see from the above sections, you have to declare various properties within the request body that are required for performing the import. The following sections give you some examples of how to define those properties.
The `dataSource` property is required within the POST request body for each resource endpoint. It contains information about the data source type and additional parameters that are required for fetching datasets from it. You'll get an overview of all supported data source types from the `/datasourceTypes` endpoint (see: Supported Data Source Types).
As an example, the following snippet shows how to define an HTTP data source with a URL parameter that will be used for performing an HTTP GET request:
{
"dataSource": {
"type": "HTTP",
"parameters": [
{
"name": "URL",
"value": "http://www.webgis-server.de/endpoint?SERVICE=WFS&REQUEST=getFeature&VERSION=1.1.0&TypeName=ns0:testType"
}
]
}
}
The `converter` property is another mandatory property that has to be defined within the POST request body. It contains configurations for the converter that should be used for importing a dataset. Ideally, you choose a converter appropriate to the dataset's format. The `/converters` endpoint provides a list of all available converter implementations and their supported properties, like encoding, MIME type, schema and additional parameters.
In the following, you'll find an example of a converter configuration for a dataset that follows the WFS 1.0.0 schema:
{
"converter": {
"name": "org.n52.kommonitor.importer.converter.wfs.v1",
"encoding": "UTF-8",
"mimeType": "application/xml",
"schema": "http://schemas.opengis.net/wfs/1.0.0/wfs.xsd",
"parameters": [
{
"name": "CRS",
"value": "EPSG:25832"
}
]
}
}
If you wish to get some additional information about the WFS 1.0.0 converter, feel free to call its API endpoint `/converters/org.n52.kommonitor.importer.converter.wfs.v1`. You will notice that the converter supports multiple schemas and, in addition, a `CRS` parameter to define the coordinate reference system of the dataset to import. Make sure you know both in order to define the converter properly for the import request.
As part of the import process, a GeoJSON FeatureCollection, which will be used for adding new resources via the Data Management API, will be generated from the imported dataset. This FeatureCollection contains the geometry from the imported dataset, some additional properties according to the Data Management API schema for those resources, and optionally the whole collection of feature attributes or only a subset of it. To tell the Importer API which properties from the original dataset should be used for the FeatureCollection properties, a property mapping has to be provided. For example, assume the following GeoJSON dataset:
{
"type": "FeatureCollection",
"name": "Baubloecke",
"features": [
{
"type": "Feature",
"properties": {
"baublock_id": "_170",
"EreignisintervallStart": "2019-05-06",
"EreignisintervallEnde": "2019-05-28",
"altersdurchschnitt": 43.4123,
"ort": "Musterstadt",
"plz": "12345"
},
"geometry": {
"type": "MultiPolygon",
"coordinates": [[[
[384891.959,5713511.4441],
[384898.084,5713509.1401],
[384947.758,5713522.8651],
[385030.73,5713539.248],
[384891.959,5713511.4441]
]]]
}
}
]
}
An appropriate property mapping would be:
{
"propertyMapping": {
"identifierProperty": "baublock_id",
"nameProperty": "baublock_id",
"validStartDateProperty": "EreignisintervallStart",
"validEndDateProperty": "EreignisintervallEnde",
"keepAttributes": false,
"keepMissingOrNullValueAttributes": false,
"attributes": [
{
"name": "ort",
"type": "string"
},
{
"name": "plz",
"mappingName": "postleitzahl",
"type": "string"
}
]
}
}
The first two properties (`identifierProperty` and `nameProperty`) are specified by the Data Management API schema and are mandatory.
The next two properties (`validStartDateProperty` and `validEndDateProperty`) are also specified by the Data Management API schema but are optional.
The `keepAttributes` property indicates whether to preserve all attributes or not. If true, you can't specify an alias for the attributes, like you would do for the attribute mappings.
You can define mappings for any attributes under the `attributes` property. Here you can also define an alias name for an attribute by setting a value for the `mappingName` property. If `keepAttributes` is true, this property will be skipped.
In addition, you have to specify whether you want to keep missing attributes or attributes that hold a NULL value by setting the `keepMissingOrNullValueAttributes` property. If true, any missing attribute will be added to the converted resource with a NULL value. Attributes that are present but hold a NULL value will be kept anyway. Note that this property will be ignored if `keepAttributes` is set to true, since all present attributes will be kept anyway.
Note that, up to now, only flat property hierarchies are supported. Nested properties in the original dataset can't be covered by the property mapping, so the import will fail for such a dataset.
The property mapping for Indicators is different from the mapping for spatial features. Since there are different strategies for encoding time series values for spatial features, the time series mapping also supports different strategies for mapping those values, which will be explained in the following.
Case 1: Related Indicator values for the same time series are encoded within different features
In this case, each single Indicator value of the same time series is encoded as a separate feature. In the example below,
there are two features for the same Spatial Unit. Both features have the same ID and also the same geometry. Only the
properties are different, because each feature only contains the properties for a single timestep of a common time series.
{
"type": "FeatureCollection",
"name": "Baubloecke",
"features": [
{
"type": "Feature",
"properties": {
"baublock_id": "_170",
"date": "2019-05-06",
"altersdurchschnitt": 43.4123
},
"geometry": { }
},
{
"type": "Feature",
"properties": {
"baublock_id": "_170",
"date": "2020-04-23",
"altersdurchschnitt": 48.4123
},
"geometry": { }
}
]
}
For this case, you only have to define a single `timeseriesMapping` besides the mapping for the `spatialReferenceKey`. If you provide such a mapping to the Importer API, the responsible converter automatically tries to group the features by their values for the `spatialReferenceKey`, so that it can merge the single Indicator values into a time series for each spatial feature. Note that you have to define both the property that holds the Indicator value and the property that holds the timestamp information:
{
"propertyMapping": {
"spatialReferenceKeyProperty": "baublock_id",
"keepMissingOrNullValueIndicator": false,
"timeseriesMappings": [
{
"indicatorValueProperty": "altersdurchschnitt",
"timestampProperty": "date"
}
]
}
}
Case 2: Each feature contains the whole time series
This encoding strategy for time series values implies that a single feature has the complete time series for an Indicator encoded within its properties. For each time step there is a separate property that holds the Indicator value for this time step. Like in the example below, the property name may contain the timestamp information for an Indicator.
{
"type": "FeatureCollection",
"name": "Baubloecke",
"features": [
{
"type": "Feature",
"properties": {
"baublock_id": "_170",
"altersdurchschnitt2019-05-06": 43.4123,
"altersdurchschnitt2020-04-23": 48.4123
},
"geometry": { }
}
]
}
If the time series is encoded like in the example above, you have to provide multiple time series mappings. For each time step you have to define which property contains the corresponding Indicator value. Note that in such a case you have to provide the final timestamp value within the mapping, rather than defining the property that holds the timestamp information.
{
"propertyMapping": {
"spatialReferenceKeyProperty": "baublock_id",
"keepMissingOrNullValueIndicator": false,
"timeseriesMappings": [
{
"indicatorValueProperty": "altersdurchschnitt2019-05-06",
"timestamp": "2019-05-06"
},
{
"indicatorValueProperty": "altersdurchschnitt2020-04-23",
"timestamp": "2020-04-23"
}
]
}
}
Similar to the spatial resource mapping, you can indicate whether missing Indicator values should be kept. This could be useful if a time series is not complete and values are missing for some time steps. Just set `keepMissingOrNullValueIndicator` to true, so that missing Indicators will be added and Indicators with a NULL value will be kept.
The KomMonitor Importer API also provides dedicated endpoints for updating existing resources. Just like the simple import, datasets will be imported from a data source and converted into an appropriate format. The only difference is that, finally, a PUT request will be performed on the resource endpoint of the Data Management API in order to update an existing resource rather than creating a new one.
The update of Georesources is done by sending a POST request to the `/georesources/update` endpoint. The request body has to contain the following properties (a sketched example request follows the list):
- `georesourceId`: The ID of the existing Georesource within the Data Management API
- `georesourcePutBody`: A JSON object in accordance with the PUT request body for the `/georesources` endpoint of the Data Management API. You can find detailed descriptions in the Data Management API documentation.
- `datasource`: Definition of the data source from which existing datasets should be updated (see: Datasource Definition).
- `converter`: Definition of the converter that should be used for converting the imported dataset (see: Converter Definition).
- `propertyMapping`: Definitions for mapping properties from the imported dataset to required properties for spatial resources (see: Spatial Resource Property Mapping).
- `dryRun`: Indicates whether a dry run import should be performed. If `true`, the import process will be performed without posting the imported resources to the Data Management API. You should perform a dry run in order to get a preview of the resources that would be imported and of possible errors that occur during the import.
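As a rough, non-authoritative sketch, an update request wraps the same building blocks together with the ID of the existing resource. The inner fields of `georesourcePutBody` and the `dataSource`/`converter` objects are abbreviated to placeholders, and the resource ID shown is just an example value:

```json
{
  "georesourceId": "00b462d7-8903-40e9-8222-10f534afcbb6",
  "georesourcePutBody": {
    "...": "properties as documented for the Data Management API PUT /georesources body"
  },
  "dataSource": { "...": "see Datasource Definition" },
  "converter": { "...": "see Converter Definition" },
  "propertyMapping": {
    "identifierProperty": "baublock_id",
    "nameProperty": "baublock_id",
    "keepAttributes": false,
    "keepMissingOrNullValueAttributes": false,
    "attributes": []
  },
  "dryRun": true
}
```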
You can update a Spatial Unit by sending a POST request to the `/spatial-units/update` endpoint. The request body has to contain the following properties:
- `spatialUnitId`: The ID of the existing Spatial Unit within the Data Management API
- `spatialUnitPutBody`: A JSON object in accordance with the PUT request body for the `/spatial-units` endpoint of the Data Management API. You can find detailed descriptions in the Data Management API documentation.
- `datasource`: Definition of the data source from which existing datasets should be updated (see: Datasource Definition).
- `converter`: Definition of the converter that should be used for converting the imported dataset (see: Converter Definition).
- `propertyMapping`: Definitions for mapping properties from the imported dataset to required properties for spatial resources (see: Spatial Resource Property Mapping).
- `dryRun`: Indicates whether a dry run import should be performed. If `true`, the import process will be performed without posting the imported resources to the Data Management API. You should perform a dry run in order to get a preview of the resources that would be imported and of possible errors that occur during the import.
If you want to update an Indicator, you have to send a POST request to the `/indicators/update` endpoint. The request body has to contain the following properties:
- `indicatorId`: The ID of the existing Indicator within the Data Management API
- `indicatorPutBody`: A JSON object in accordance with the PUT request body for the `/indicators` endpoint of the Data Management API. You can find detailed descriptions in the Data Management API documentation.
- `datasource`: Definition of the data source from which existing datasets should be updated (see: Datasource Definition).
- `converter`: Definition of the converter that should be used for converting the imported dataset (see: Converter Definition).
- `propertyMapping`: Definitions for mapping properties from the imported dataset to required properties for Indicators (see: Indicator Property Mapping).
- `dryRun`: Indicates whether a dry run import should be performed. If `true`, the import process will be performed without posting the imported resources to the Data Management API. You should perform a dry run in order to get a preview of the resources that would be imported and of possible errors that occur during the import.
If you want to extend the Importer API, you should know about the relevant classes and their relationships with each other. Therefore, you'll find a simple class diagram below that shows the most relevant classes:
RequestHandler
There is one generic `AbstractRequestHandler` that implements common handling for incoming API requests. Depending on the type of the request, a concrete handler implementation handles the request in a type-specific way. E.g., if there is an incoming request for updating a Spatial Unit, the `SpatialUnitUpdateHandler`, which is bound to an `UpdateSpatialUnitPOSTInputType`, will be invoked. For importing the requested resource, all handlers use a certain `DataSourceRetriever` and a certain `Converter` that will be provided by repositories.
DataSourceRetriever
Each of the different implementations of the generic `DataSourceRetriever` interface is bound to a certain dataset type and is responsible for retrieving a `Dataset` that holds an object of the same type. E.g., the `FileRetriever` retrieves a `File` and creates a `Dataset` that is bound to the `File` object, while the `HttpRetriever` does the same with an `InputStream`. A certain `DataSourceRetriever` will be provided to the RequestHandler by the `DataSourceRetrieverRepository`, depending on the data source type, which is defined by the `dataSource` property within the import POST request body (see: Datasource Definition).
Converter
Certain implementations of the `Converter` interface handle the conversion of specific data formats. They take a `Dataset` which was retrieved by a `DataSourceRetriever` and convert the object that is bound to the `Dataset` into `Indicator` and `SpatialResource` objects. Those two entity types will then be used to generate the request body for the POST request against the Data Management API. Like the `DataSourceRetriever`, a certain `Converter` implementation will be provided by a `ConverterRepository` to the RequestHandler, depending on the `converter` property definition within the import POST request body (see: Converter Definition).
The easiest way to implement a `DataSourceRetriever` for an additional data source is to extend `AbstractDataSourceRetriever`. Let's have a look at how this could be done using the example of the existing `InlineTextRetriever`, which aims to retrieve datasets that are defined 'inline' within an import POST request body.
- Annotate your class with Spring's `@Component`, so that it can be auto-injected within the `DataSourceRetrieverRepository`:
@Component
public class InlineTextRetriever extends AbstractDataSourceRetriever<String> {
}
- Implement `initType()` in order to define a unique type that will later be used to identify the requested `DataSourceRetriever`, and `initSupportedParameters()` for declaring the supported parameters. For the `InlineTextRetriever`, only a `payload` parameter is necessary, so that the dataset can be declared within the import POST request body as an inline value for this property. Note that for each `DataSourceParameter`, a unique name, a description and a value type have to be defined.
@Component
public class InlineTextRetriever extends AbstractDataSourceRetriever<String> {
private static final String TYPE = "INLINE";
private static final String PARAM_PAYLOAD = "payload";
private static final String PARAM_PAYLOAD_DESC = "The payload as plain text";
@Override
protected String initType() {
return TYPE;
}
@Override
protected Set<DataSourceParameter> initSupportedParameters() {
Set<DataSourceParameter> parameters = new HashSet<>();
DataSourceParameter payloadParam = new DataSourceParameter(PARAM_PAYLOAD, PARAM_PAYLOAD_DESC, DataSourceParameter.ParameterTypeValues.STRING);
parameters.add(payloadParam);
return parameters;
}
...
}
- Implement `retrieveDataset()`, which should finally provide a `Dataset` as a result. For this you have to use the parameter values that have been defined within the import POST request. Make sure that each required parameter exists; otherwise, throw an exception. For the `InlineTextRetriever`, you only have to fetch the text content from the `payload` property and return it bound to a `Dataset`. Other implementations may require a more complex retrieval strategy. E.g., the `HttpRetriever` has to request a URL that has been defined as a parameter.
@Component
public class InlineTextRetriever extends AbstractDataSourceRetriever<String> {
...
@Override
public Dataset<String> retrieveDataset(DataSourceDefinitionType datasource) throws ImportParameterException {
Optional<String> payload = this.getParameterValue(PARAM_PAYLOAD, datasource.getParameters());
if (!payload.isPresent()) {
throw new ImportParameterException("Missing parameter: " + PARAM_PAYLOAD);
}
return new Dataset<String>(payload.get());
}
}
In order to support additional data formats, you have to implement new converters. Just extend the `AbstractConverter`. As an example, let's assume we want to provide a converter that supports converting CSV-based datasets.
- The new converter should be registered by the `ConverterRepository`, so just annotate your class with `@Component`:
@Component
public class CsvConverter extends AbstractConverter {
}
- Provide various definitions of the supported dataset types. This includes the definition of supported MIME types, encodings and schemas as well as specific `ConverterParameters` that are required for telling the converter how to handle a certain dataset. Reasonable parameters for the `CsvConverter` would be a separator that is used for separating the columns and a parameter to define the column that includes the geometries.
@Component
public class CsvConverter extends AbstractConverter {
private static final String NAME = "org.n52.kommonitor.importer.converter.csv";
private static final String PARAM_SEP = "separator";
private static final String PARAM_SEP_DESC = "The separator of the CSV dataset";
private static final String PARAM_GEOM_COL = "geometryColumn";
private static final String PARAM_GEOM_DESC = "The column that contains the geometry as WKT";
@Override
public String initName() {
return NAME;
}
@Override
public Set<String> initSupportedMimeType() {
Set<String> mimeTypes = new HashSet<>();
mimeTypes.add("text/csv");
return mimeTypes;
}
@Override
public Set<String> initSupportedEncoding() {
Set<String> encodings = new HashSet<>();
encodings.add("UTF-8");
return encodings;
}
@Override
public Set<String> initSupportedSchemas() {
return null;
}
@Override
public Set<ConverterParameter> initConverterParameters() {
Set<ConverterParameter> params = new HashSet<>();
params.add(new ConverterParameter(PARAM_SEP, PARAM_SEP_DESC, ConverterParameter.ParameterTypeValues.STRING));
params.add(new ConverterParameter(PARAM_GEOM_COL, PARAM_GEOM_DESC, ConverterParameter.ParameterTypeValues.STRING));
return params;
}
...
}
- Implement `convertSpatialResources()` for converting a `Dataset` into `SpatialResources` and `convertIndicators()` for converting it into `Indicators`. You have to utilize the `ConverterParameter` values in order to handle the `Dataset` properly. So, first of all, check if all required parameters exist and afterwards fetch their values. Subsequently, you should resolve the `Dataset` object. For convenience, the `AbstractConverter` provides a `getInputStream()` method that retrieves an `InputStream` from the different object types (like `File`, `String`, etc.) that are bound to the `Dataset`. With this `InputStream` you can start converting your dataset.
If you choose to parse a dataset with the GeoTools framework, which provides several plugins for different formats, you can subsequently use the `org.n52.kommonitor.importer.decoder.FeatureDecoder`, which provides several helper methods for converting GeoTools `Features` and `FeatureCollections` into `SpatialResources` and `Indicators`.
@Component
public class CsvConverter extends AbstractConverter {
...
@Override
public List<SpatialResource> convertSpatialResources(ConverterDefinitionType converterDefinition,
Dataset dataset,
SpatialResourcePropertyMappingType propertyMapping)
throws ConverterException, ImportParameterException {
Optional<String> sepOpt = this.getParameterValue(PARAM_SEP, converterDefinition.getParameters());
if (!sepOpt.isPresent()) {
throw new ImportParameterException("Missing parameter: " + PARAM_SEP);
}
Optional<String> geomColOpt = this.getParameterValue(PARAM_GEOM_COL, converterDefinition.getParameters());
if (!geomColOpt.isPresent()) {
throw new ImportParameterException("Missing parameter: " + PARAM_GEOM_COL);
}
List<SpatialResource> spatialResources = new ArrayList<>();
InputStream input = getInputStream(converterDefinition, dataset);
...
return spatialResources;
}
@Override
public List<IndicatorValue> convertIndicators(ConverterDefinitionType converterDefinition,
Dataset dataset,
IndicatorPropertyMappingType propertyMapping) throws ConverterException, ImportParameterException {
Optional<String> sepOpt = this.getParameterValue(PARAM_SEP, converterDefinition.getParameters());
if (!sepOpt.isPresent()) {
throw new ImportParameterException("Missing parameter: " + PARAM_SEP);
}
Optional<String> geomColOpt = this.getParameterValue(PARAM_GEOM_COL, converterDefinition.getParameters());
if (!geomColOpt.isPresent()) {
throw new ImportParameterException("Missing parameter: " + PARAM_GEOM_COL);
}
List<IndicatorValue> indicators = new ArrayList<>();
InputStream input = getInputStream(converterDefinition, dataset);
...
return indicators;
}
}
The class `org.n52.kommonitor.importer.utils.ImportMonitor` is responsible for monitoring import processes. Note that for each HTTP request a separate instance of this class will be created in order to monitor errors only for the corresponding request. The `@RequestScope` annotation ensures that there are no concurrency issues if multiple requests are handled at the same time. Up to now, only conversion errors will be recorded. So, feel free to extend this class for monitoring additional errors.
The project comes with the latest API and model classes. However, if you wish to customize the KomMonitor Importer API or the Data Management API client, you'll find the OpenAPI spec documents at https://gitlab.fbg-hsbo.de/kommonitor/kommonitor-api-specs. You can customize the OpenAPI definitions just as you need. Make sure that the customized specs become available as artifacts within your build environment, so just build your local project with Maven.
In order to update the Importer API, some Maven build profiles are included within the individual modules. Those profiles are configured to automate code generation of the corresponding API and model classes from the OpenAPI specification. Just run `mvn compile -Pgenerate-models` from kommonitor-importer-models, `mvn compile -Pgenerate-api` from kommonitor-importer-api or `mvn compile -Pgenerate-client` from kommonitor-datamanagement-api-client. But be careful with the auto-generation of new API or model classes: some existing classes may be overwritten, so you should check all the changed classes after the code generation.
Name | Organization | Mail |
---|---|---|
Christian Danowski-Buhren | Bochum University of Applied Sciences | christian.danowski-buhren@hs-bochum.de |
Sebastian Drost | 52°North GmbH | s.drost@52north.org |
Andreas Wytzisk | Bochum University of Applied Sciences | Andreas-Wytzisk@hs-bochum.de |
- Department of Geodesy, Bochum University of Applied Sciences
- Department for Cadastre and Geoinformation, Essen
- Department for Geodata Management, Surveying, Cadastre and Housing Promotion, Mülheim an der Ruhr
- Department of Geography, Ruhr University of Bochum
- 52°North GmbH, Münster
- Kreis Recklinghausen