🐛 Fix DatashaderRasterizer for GeoDataFrame wrapped in StreamWrapper #104
In DatashaderRasterizer (from zen3geo v0.3.0 to v0.6.0), the conversion of a geopandas.GeoDataFrame to a spatialpandas.GeoDataFrame happens in this try-except statement:
zen3geo/zen3geo/datapipes/datashader.py
Lines 214 to 223 in 377992b
This was added at commit 6805418 in #35. In some rare cases, the conversion fails when the vector geometry is wrapped in a StreamWrapper class (but not always?), due to spatialpandas not being able to detect the presence of a geometry column. The traceback looks like this:
Relevant logic in spatialpandas is at https://github.com/holoviz/spatialpandas/blob/aeeba42751fd7cc16fd7e4a142b053fb9cd0c033/spatialpandas/geodataframe.py#L26-L32
Since spatialpandas doesn't use duck typing, fixing this would require the vector variable to be converted from a StreamWrapped instance to a geopandas.GeoDataFrame or geopandas.GeoSeries object. This could happen by:
1. Using vector.geometry, which returns a geopandas.GeoSeries, and passing that into spatialpandas.GeoDataFrame.
2. Using vector.loc[:] to return a view of the geopandas.GeoDataFrame or geopandas.GeoSeries object.

The bugfix in this PR uses (1), basically following the original logic prior to 6805418 in #35. The disadvantage is that the other columns of the GeoDataFrame are lost on conversion to a GeoSeries, but the rasterization step does not use any column besides the geometry column anyway (though it is usually nice to not delete data until necessary).
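To make the failure mode concrete, here is a minimal sketch of why a strict type check rejects a wrapped object while the attribute pulled out of it passes. None of these classes come from geopandas, spatialpandas or torchdata: FakeGeoSeries, FakeGeoDataFrame, FakeStreamWrapper and strict_convert are hypothetical stand-ins, and the sketch assumes the wrapper forwards attribute access to the wrapped object.

```python
class FakeGeoSeries:
    """Hypothetical stand-in for geopandas.GeoSeries."""

    def __init__(self, shapes):
        self.shapes = list(shapes)


class FakeGeoDataFrame:
    """Hypothetical stand-in for geopandas.GeoDataFrame."""

    def __init__(self, shapes, names):
        self.geometry = FakeGeoSeries(shapes)
        self.names = list(names)


class FakeStreamWrapper:
    """Hypothetical wrapper that forwards attribute access (assumption)."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so `_obj` resolves normally.
        return getattr(self._obj, name)


def strict_convert(data):
    """Mimic a constructor that checks concrete types, not duck typing."""
    if isinstance(data, FakeGeoDataFrame):
        return list(data.geometry.shapes)
    if isinstance(data, FakeGeoSeries):
        return list(data.shapes)
    raise ValueError("no geometry column detected")


wrapped = FakeStreamWrapper(
    FakeGeoDataFrame(["POINT (0 0)", "POINT (1 1)"], ["a", "b"])
)

# The wrapper itself is not an instance of either accepted type.
try:
    strict_convert(wrapped)
    wrapped_ok = True
except ValueError:
    wrapped_ok = False

# Option (1): the forwarded attribute is the real, unwrapped series.
converted = strict_convert(wrapped.geometry)
```

In this sketch the other columns (names) never reach strict_convert, which mirrors the trade-off described above: only the geometry survives the conversion.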
Also, this PR makes the try-except statement catch a more specific ValueError (since the 'Polygon' dtype should be supported). Ideally there would be a unit test to cover the
ValueError: A spatialpandas GeoDataFrame must contain at least one spatialpandas GeometryArray column
case, but it's hard to create a minimal reproducible example, so the test just exercises something that hits the same line of code instead.

Patches #35.
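As a rough illustration of the narrower exception handling and of a test that reaches the same except branch, the sketch below uses only the standard library. The helper convert_or_explain, the choice of NotImplementedError and the message text are assumptions made for this example, not the code in datashader.py.

```python
def convert_or_explain(convert, data):
    """Run convert(data), translating only ValueError (hypothetical helper)."""
    try:
        return convert(data)
    except ValueError as err:
        # Other exception types are deliberately not caught here.
        raise NotImplementedError(f"cannot rasterize this input: {err}") from err


def always_value_error(data):
    """Stand-in converter that fails the way an unsupported input might."""
    raise ValueError("must contain at least one geometry column")


def always_type_error(data):
    """Stand-in converter that fails for an unrelated reason."""
    raise TypeError("unrelated failure")


# A ValueError reaches the except branch and is re-raised with context.
try:
    convert_or_explain(always_value_error, object())
    translated = None
except NotImplementedError as err:
    translated = err

# A TypeError is not swallowed by the narrower except clause.
try:
    convert_or_explain(always_type_error, object())
    passthrough = None
except TypeError as err:
    passthrough = err
```

Catching only ValueError keeps unrelated failures visible with their original type, and any input that raises ValueError inside the conversion is enough to exercise the branch even without a minimal reproducible example of the StreamWrapper case.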