Add GeoPackage support #456

Merged
merged 6 commits into from
Jul 3, 2023

adrien-berchet
Member

@adrien-berchet adrien-berchet commented Jun 25, 2023

Solves #410

This PR adds support for GeoPackage using SpatiaLite.

Main things to check:

  • Should we automatically activate the Amphibious mode?
  • Should we automatically activate the AutoGPKG mode?
  • How to use ST_Transform with a GeoPackage? Should we add a specific function to create the required SRS table?

@adrien-berchet adrien-berchet marked this pull request as draft June 25, 2023 20:28
@adrien-berchet
Member Author

@caspervdw could you try this version please?

@caspervdw caspervdw left a comment

Looks good! I'll now proceed with the testing

doc/spatialite_tutorial.rst (outdated, resolved)
geoalchemy2/admin/dialects/sqlite.py (outdated, resolved)
geoalchemy2/admin/dialects/sqlite.py (outdated, resolved)
geoalchemy2/admin/dialects/sqlite.py (outdated, resolved)
geoalchemy2/admin/dialects/sqlite.py (outdated, resolved)
geoalchemy2/admin/dialects/sqlite.py (outdated, resolved)
tests/__init__.py (outdated, resolved)
tests/test_functional_sqlite.py (resolved)
tests/test_functional_sqlite.py (outdated, resolved)
if is_GeoPkg:
    if not dbapi_conn.execute("SELECT CheckGeoPackageMetaData();").fetchone()[0]:
        # This only works on the main database
        dbapi_conn.execute("SELECT gpkgCreateBaseTables();")

@caspervdw caspervdw Jun 27, 2023

I know it was already there: but the InitSpatialMetadata() really takes a ridiculous amount of time. It can be mitigated by doing InitSpatialMetadata(1), and even more by calling "PRAGMA journal_mode = MEMORY" before, but still, I think it is too heavy to do by default.

Maybe it is good to split it out in an "init_spatialite" and "init_geopackage" function, for the user to call.

Next to that, as InitSpatialMetadata() populates the srs table, I think it is good to try to populate the geopackage equivalent here as well. Not sure how though.
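
A minimal sketch of the two mitigations mentioned in this comment, assuming the mod_spatialite extension is already loaded on the connection. The name `init_spatialite_fast` is hypothetical and is not a GeoAlchemy2 function; it only shows the order of the two statements.

```python
import sqlite3


def init_spatialite_fast(dbapi_conn):
    """Initialize the SpatiaLite metadata tables with both mitigations.

    Assumes mod_spatialite was loaded on ``dbapi_conn`` beforehand, e.g. with
    ``dbapi_conn.enable_load_extension(True)`` followed by
    ``dbapi_conn.load_extension("mod_spatialite")``.
    """
    # Keep the rollback journal in memory instead of on disk.
    dbapi_conn.execute("PRAGMA journal_mode = MEMORY")
    # The argument 1 asks SpatiaLite to run the whole initialization
    # inside a single transaction.
    dbapi_conn.execute("SELECT InitSpatialMetadata(1)")
```

Whether this should run by default or only when the user calls it is exactly the open question in this thread; the sketch does not take a side.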

Member Author

I agree it's super long, that's why I use already initialized DBs in the tests.
Using InitSpatialMetadata(1) could be interesting, though it would be a breaking change (or we could just let the user pass a parameter to load_spatialite).
Anyway, I agree it should be consistent between SpatiaLite and gpkg, so I will see what's the best option.

Member Author

In the end I added the possibility to pass an init_mode to the function, so it is possible to initialize the table with no SRID and then add the relevant ones manually. Unfortunately, it is not possible to pass an argument to the sqlalchemy.event.listen function, so anyone who wants to use this argument has to either create a new function that calls load_spatialite with a hard-coded init_mode or call load_spatialite manually after the connection is created.
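
A minimal sketch of the "new function" option, assuming load_spatialite accepts init_mode as a keyword argument as described in this comment. The helper name `with_fixed_options` is hypothetical, and the SQLAlchemy registration is shown only as a comment because it needs an engine and the SpatiaLite library.

```python
def with_fixed_options(loader, **options):
    """Wrap ``loader`` so it can be registered as a "connect" listener.

    The returned function takes the two positional arguments that the
    "connect" event passes (DB-API connection and connection record) and
    forwards them to ``loader`` together with the fixed keyword options.
    """

    def listener(dbapi_conn, connection_record):
        return loader(dbapi_conn, connection_record, **options)

    return listener


# Intended registration (assumes SQLAlchemy and GeoAlchemy2 are installed):
#
#   from sqlalchemy.event import listen
#   from geoalchemy2 import load_spatialite
#   listen(engine, "connect", with_fixed_options(load_spatialite, init_mode="EMPTY"))
```

The wrapper hard-codes the option once, so the event system still sees a plain two-argument listener.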

VALUES (
'{}',
'features',
NULL,

  • QGIS and Fiona require an identifier; as every table can only have 1 geometry column, I suggest putting the table name here.
  • The description should be an empty string, not NULL.

Member Author

Ok, let's make it compatible with QGIS, it makes sense.

Member Author

I made these changes and now it seems very close to a GeoPackage created by QGIS. Does it look good to you?

@caspervdw

@adrien-berchet, I tested it and a working GeoPackage seems to be generated. Some issues I could solve on my side (threedi-schema); others I noted in the review.

One big suggestion is making this a different dialect; it seems logical looking at the code, and the "auto-discovery" gave some issues for me. For the rest it is looking good!

@adrien-berchet
Member Author

Hi @caspervdw
Thanks for your feedback and review!
The specific dialect is an interesting idea, as it would solve the different ways of identifying GPKG in load_spatialite and in is_gpkg. I suspect it's a bit overkill but I will see how we can do this.

@adrien-berchet adrien-berchet marked this pull request as ready for review July 2, 2023 20:54
tests/conftest.py (outdated, resolved)
@caspervdw

@adrien-berchet I am 👍 on merging this. Two observations during testing:

Alembic compatibility

To make this work with alembic, you should register it as a dialect:

from alembic.ddl.impl import DefaultImpl

class GpkgImpl(DefaultImpl):
    __dialect__ = 'geopackage'

ogrinfo complains

If I run ogrinfo on a gpkg generated using this PR I get:

Warning 1: Field format 'VARCHAR(50)' not supported

Looking at the gpkg spec (section 1.1.1.1.3) there is indeed no mention of a VARCHAR. It should be TEXT or TEXT(50).

These two are no blockers for us however, so, great work!

@adrien-berchet
Member Author

Alembic compatibility

To make this work with alembic, you should register it as a dialect:

from alembic.ddl.impl import DefaultImpl

class GpkgImpl(DefaultImpl):
    __dialect__ = 'geopackage'

In geoalchemy2.alembic_helpers I register it like this:

class GeoPackageImpl(SQLiteImpl):
    """Class to copy the Alembic implementation from SQLite to GeoPackage."""

    __dialect__ = "geopackage"

Isn't it enough? Did you import geoalchemy2.alembic_helpers in your alembic script? Because this module is not automatically imported with GeoAlchemy2, it has to be explicitly imported.

ogrinfo complains

If I run ogrinfo on a gpkg generated using this PR I get:

Warning 1: Field format 'VARCHAR(50)' not supported

Looking at the gpkg spec (section 1.1.1.1.3) there is indeed no mention of a VARCHAR. It should be TEXT or TEXT(50).

I'm not able to reproduce this, do you know to which table and field this warning is related?
Is it only when you use the geoalchemy2.admin.dialects.geopackage.populate_spatial_ref_sys() function?

These two are no blockers for us however, so, great work!

Great, thanks!
I will try to fix these two issues so we can release a clean version soon.

@adrien-berchet
Member Author

@caspervdw I performed a few tests with Alembic and everything seems to work in my case so I don't know what's going on on your side. I still think it's because you did not import geoalchemy2.alembic_helpers.

@caspervdw

@caspervdw I performed a few tests with Alembic and everything seems to work in my case so I don't know what's going on on your side. I still think it's because you did not import geoalchemy2.alembic_helpers.

Sorry I indeed didn’t import the helpers. So that’s ok then!

For ogrinfo: I used the command-line tool on Ubuntu 22.04, with GDAL 3.4.1.

@adrien-berchet
Member Author

Sorry I indeed didn’t import the helpers. So that’s ok then!

Ok, no problem :)

For ogrinfo: I used the command-line tool on Ubuntu 22.04, with GDAL 3.4.1.

I'm still not able to reproduce this issue locally, but there is only one place where I create VARCHAR columns: in geoalchemy2.admin.dialects.geopackage.populate_spatial_ref_sys(). So after rereading the GeoPackage spec, I created a spatial_ref_sys view instead of a table, as described here: http://www.geopackage.org/spec/#_gpkg_spatial_ref_sys
And I renamed the populate_spatial_ref_sys() function to create_spatial_ref_sys_view() to be more consistent. Everything still works for me and there is no VARCHAR anywhere, so I think we are good now.
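
A minimal sketch of such a view, following the column mapping given in the GeoPackage spec section linked above. This is not necessarily what create_spatial_ref_sys_view does internally; `create_srs_view` and `demo` are hypothetical names, and the table built in `demo` is a simplified stand-in for the one a real GeoPackage contains.

```python
import sqlite3


def create_srs_view(conn):
    """Expose gpkg_spatial_ref_sys under SQL/MM-style names, with no VARCHAR columns."""
    conn.execute(
        """
        CREATE VIEW IF NOT EXISTS spatial_ref_sys AS
        SELECT
            srs_id AS srid,
            organization AS auth_name,
            organization_coordsys_id AS auth_srid,
            definition AS srtext
        FROM gpkg_spatial_ref_sys
        """
    )


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the table found in an initialized GeoPackage.
    conn.execute(
        """
        CREATE TABLE gpkg_spatial_ref_sys (
            srs_name TEXT NOT NULL,
            srs_id INTEGER PRIMARY KEY,
            organization TEXT NOT NULL,
            organization_coordsys_id INTEGER NOT NULL,
            definition TEXT NOT NULL,
            description TEXT
        )
        """
    )
    # 'undefined' is a placeholder here, not the real WKT definition.
    conn.execute(
        "INSERT INTO gpkg_spatial_ref_sys VALUES ('WGS 84', 4326, 'EPSG', 4326, 'undefined', '')"
    )
    create_srs_view(conn)
    return conn.execute("SELECT srid, auth_name, auth_srid FROM spatial_ref_sys").fetchall()
```

Because a view adds no storage of its own, the SRS rows stay in a single place and the view introduces no new column types.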

@adrien-berchet adrien-berchet merged commit 15ba005 into geoalchemy:master Jul 3, 2023
@adrien-berchet adrien-berchet deleted the gpkg_support branch July 3, 2023 19:25
@caspervdw

I'm still not able to reproduce this issue locally, but there is only one place where I create VARCHAR columns: in geoalchemy2.admin.dialects.geopackage.populate_spatial_ref_sys(). So after rereading the GeoPackage spec, I created a spatial_ref_sys view instead of a table, as described here: http://www.geopackage.org/spec/#_gpkg_spatial_ref_sys And I renamed the populate_spatial_ref_sys() function to create_spatial_ref_sys_view() to be more consistent. Everything still works for me and there is no VARCHAR anywhere, so I think we are good now.

Very sorry, I tried to reproduce, and it appeared I had a String(100) in my table definition. If I replace it with Text(100), the column is neatly created with type TEXT(100). So we're good here.
It might be good though to limit the number of allowed types when using the gpkg driver to those specified in the spec (section 1.1.1.1.3). But certainly not required for our purposes.
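
A minimal sketch of what such a check could look like, done outside GeoAlchemy2 with the standard sqlite3 module. The type list is my reading of the data-type table in the spec and should be verified against it; geometry type names are left out, so a geometry column would also be flagged. `GPKG_BASE_TYPES` and `unsupported_columns` are hypothetical names.

```python
import re
import sqlite3

# Assumption: non-geometry column types allowed by the GeoPackage spec.
GPKG_BASE_TYPES = {
    "BOOLEAN", "TINYINT", "SMALLINT", "MEDIUMINT", "INT", "INTEGER",
    "FLOAT", "DOUBLE", "REAL", "TEXT", "BLOB", "DATE", "DATETIME",
}


def unsupported_columns(conn, table):
    """Return (column, declared_type) pairs whose type is not in GPKG_BASE_TYPES.

    ``table`` is interpolated into the statement, so it must be a trusted name.
    """
    flagged = []
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    for _cid, name, declared, *_rest in conn.execute(f'PRAGMA table_info("{table}")'):
        # Drop a trailing size such as "(50)" before comparing.
        base = re.sub(r"\(.*\)$", "", declared).strip().upper()
        if base not in GPKG_BASE_TYPES:
            flagged.append((name, declared))
    return flagged
```

On a table declared with a VARCHAR(50) column and a TEXT(100) column, only the VARCHAR column is reported, which matches the String versus Text difference described above.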

In the end I added the possibility to pass an init_mode to the function, so it is possible to initialize the table with no SRID and then add the relevant ones manually. Unfortunately, it is not possible to pass an argument to the sqlalchemy.event.listen function, so anyone who wants to use this argument has to either create a new function that calls load_spatialite with a hard-coded init_mode or call load_spatialite manually after the connection is created.

That's all right, you can do it with a lambda like this:

listen(engine, "connect", lambda x, y: load_spatialite(x, y, init_mode="EMPTY"))

@adrien-berchet
Member Author

Very sorry, I tried to reproduce, and it appeared I had a String(100) in my table definition. If I replace it with Text(100), the column is neatly created with type TEXT(100). So we're good here. It might be good though to limit the number of allowed types when using the gpkg driver to those specified in the spec (section 1.1.1.1.3). But certainly not required for our purposes.

Ahah ok, no problem.
For now I'm going to let users define whatever types they want, because GeoAlchemy 2 should only handle spatial types. The kind of limitation you propose should go into SQLAlchemy, IMO.

That's all right, you can do it with a lambda like this:

listen(engine, "connect", lambda x, y: load_spatialite(x, y, init_mode="EMPTY"))

Exactly. I am improving the docs for this in #457 to make them more complete.
