Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ETL docu #227

Open
SlowMo24 opened this issue Apr 15, 2020 · 10 comments
Open

Update ETL docu #227

SlowMo24 opened this issue Apr 15, 2020 · 10 comments
Assignees
Labels
documentation Readmes, Javadoc, Tutorials, etc. help wanted Extra attention is needed

Comments

@SlowMo24
Copy link
Contributor

Bug
Using the current ETL docu creates an H2-file that cannot be directly used in the Ohsome-API ("extract.region" not parsable -> it is empty).

To Reproduce
Run the ETL as documented here: https://github.com/GIScience/oshdb/tree/master/oshdb-tool/etl then start a local ohsome-api as docuemented here: https://gitlab.gistools.geog.uni-heidelberg.de/giscience/big-data/ohsome/ohsome-api/tree/master

@SlowMo24 SlowMo24 added the bug Something isn't working as expected label Apr 15, 2020
@rtroilo
Copy link
Member

rtroilo commented Apr 15, 2020

@FabiKo117 the problem here is that the extract.region is not an mandatory parameter (--ploy or --bbox) of the Extract step.

@FabiKo117
Copy link
Contributor

But I think the ohsome API would always need an info about the extract region to react accordingly if a request is partially/fully outside of the region where data is available.

@tyrasd
Copy link
Member

tyrasd commented Apr 16, 2020

the ohsome API would always need an info about the extract region

yeah, but the ohsome API is not the only thing in the world potentially using an OSHDB extract. 😅

There are also theoretically some situations where the region is unknown (e.g. when transforming an OSM history file from an unknown source). Therefore it IMO makes sense that this is an optional parameter in the ETL tool.

--

I think this is simply "just" a missing documentation problem, not really a bug.

@tyrasd tyrasd added documentation Readmes, Javadoc, Tutorials, etc. help wanted Extra attention is needed and removed bug Something isn't working as expected labels Apr 16, 2020
@rtroilo
Copy link
Member

rtroilo commented Apr 16, 2020

There are 2 more infos in the extract,

  • header.bbox (which comes from the pbf header itself, not always set!)
    and
  • data.bbox (which is computed from the node locations)

I'm not sure if we want this but the ohsome-api could fall back to one of those with a warning.

@SlowMo24
Copy link
Contributor Author

SlowMo24 commented Apr 16, 2020

I think our tools should work seamlessly together with default config. The Ohsome-api handles missing metadata just fine i guess. So in my opinion ETL could also either place a parsable dummy value for extract.region if it is not given or omit the information. ATM it creates an invalid geojson which crashes the ohsome api.

@FabiKo117
Copy link
Contributor

yeah, but the ohsome API is not the only thing in the world potentially using an OSHDB extract.

I did not say that it should become a mandatory parameter :P

I am totally fine with just adding this info to the docu, so people that want to use it for the ohsome API know what they have to do.

@SlowMo24
Copy link
Contributor Author

When documenting it also has to be mentioned that the ohsome-API only accepts a GeoJSON-geometry, not a Feature or FeatureCollection. I just faced that problem with an extract downloaded from our download server.

@tyrasd
Copy link
Member

tyrasd commented Apr 16, 2020

data.bbox (which is computed from the node locations) […] the ohsome-api could fall back to one of those with a warning.

@rtroilo In general, we don't want to fall back to this in the ohsome-api. This is because the computed bbox of an OSM (history) extract is not suitable to determine the "usable coverage area" of the extract. A user of the ohsome-api would not see the warning, if there was one while starting up the API.

@tyrasd
Copy link
Member

tyrasd commented Apr 16, 2020

So in my opinion ETL could also either place a parsable dummy value for extract.region if it is not given or omit the information.

@SlowMo24 What dummy value should be included? something like {"type": "Polygon", "coordinates": [[[0,0],[0,0],[0,0],[0,0],[0,0]]]} would be the only "not incorrect" thing to put in as a dummy value, but isn't very useful: Instead of crashing the ohsome-api's startup routine, it would kind of crash every incoming request (which is worse in my opinion).

IMHO the current situation is almost fine: The ohsome-api requires special metadata field to be set, which the ETL tool cannot determine and set automatically. Maybe instead of crashing the ohsome api should print a meaningful error message, for example:

Required field "extract.region" in metadata table of oshdb source not found. If you work with a self-produced OSHDB H2 extract, please don't forget to specify the necessary extract region information as documented in .

@Hagellach37 Hagellach37 added this to the release 1.0 milestone Oct 20, 2022
@Hagellach37
Copy link

This will be addressed by @rtroilo when re-integrating the importer.

@tyrasd tyrasd removed this from the release 1.0.0 milestone Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Readmes, Javadoc, Tutorials, etc. help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

5 participants