Questions about native encodings with geometry_types #239
-
When a geometry column contains more than one type of geometry, the single-geometry encodings (do we call them "native" anywhere?) aren't appropriate, and I would expect that to be an error. (Apologies if I missed the point there.)
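As a rough illustration of that expectation, here is a minimal sketch of a metadata check that rejects a mixed `geometry_types` list under a single-geometry encoding. The function name `check_native_encoding` is hypothetical, and the encoding names (`"WKB"`, `"point"`, `"linestring"`, and so on) are assumed to match the GeoParquet column metadata; treating an empty `geometry_types` list as acceptable is also an assumption.

```python
# Assumed encoding names for the single-geometry ("native") encodings.
NATIVE_ENCODINGS = {
    "point",
    "linestring",
    "polygon",
    "multipoint",
    "multilinestring",
    "multipolygon",
}


def check_native_encoding(encoding, geometry_types):
    """Raise ValueError if geometry_types does not fit a single-geometry encoding.

    Hypothetical helper: it only inspects column metadata, not the data itself.
    """
    if encoding == "WKB":
        return  # WKB can hold any mix of geometry types
    if encoding not in NATIVE_ENCODINGS:
        raise ValueError(f"unknown encoding: {encoding!r}")

    # Drop any dimension suffix ("Point Z" -> "point") before comparing.
    base_types = {t.split()[0].lower() for t in geometry_types if t.strip()}
    unexpected = base_types - {encoding}
    if unexpected:
        raise ValueError(
            f"encoding {encoding!r} cannot store geometry types {sorted(unexpected)}"
        )
```

With this sketch, `check_native_encoding("point", ["Point", "Point Z"])` passes, while `check_native_encoding("point", ["Point", "LineString"])` raises `ValueError`.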
I am not aware of any discussions about that, but the GeoArrow spec on which it is based does support M coordinates, and I don't think there would be any debate about how we would store them:

    import pyarrow as pa
    from geoarrow import pyarrow as ga
    from geoarrow.pyarrow import io

    tab = pa.table({"geom": ga.as_geoarrow(["POINT ZM (1 2 3 4)"])})
    io.write_geoparquet_table(tab, "out.parquet", geometry_encoding=io.geoparquet_encoding_geoarrow())
    io.read_geoparquet_table("out.parquet").schema.field("geom").type.storage_type
    #> StructType(struct<x: double, y: double, z: double, m: double>)

It would be helpful to document your use case for M coordinates. I happen to agree that supporting them is important (for completeness with existing specifications), but I struggle to find real-world use cases for Parquet where this matters (not because it doesn't, but because I am more of a tool developer than a geospatial data user these days!).
-
I'm implementing a GeoParquet reader using the native encodings and had some questions about how the encodings behave when different geometry types are set:
Union geometry types: based on the discussion in "Add GeoArrow encoding as an option to the specification" (#189, comment), it seems that union types are not supported in the new native encodings. How should a reader handle a column whose geometry_types value is a union but whose encoding is a native type? Right now I'm erroring out, but I wanted to make sure that's the intended behavior.
Z values: does a 3D type in the geometry_types field (e.g. "POINT Z") add a required z field to the Parquet point encoding?
General question: we use M values a lot, and one thing that will stop us from using the new native encodings is that M values aren't supported. Are there any plans to support them? I know they're not supported with WKB either, but since that's a well-defined spec, we're just writing them out anyway and letting WKB parsers figure it out, since most of them seem able to.
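For the Z and M questions, one way a reader might derive the coordinate struct fields it expects from a geometry type string is sketched below. This is an assumption-laden illustration, not specified behavior: the helper name `expected_coordinate_fields` is hypothetical, the `x`/`y`/`z`/`m` field names follow the GeoArrow struct layout, and accepting "M" and "ZM" suffixes borrows the WKT convention rather than anything the specification confirms.

```python
def expected_coordinate_fields(geometry_type):
    """Return the struct field names implied by a geometry type string.

    Hypothetical helper: "Point" -> x, y; "Point Z" -> x, y, z. The "M" and
    "ZM" suffixes are an assumption borrowed from WKT.
    """
    parts = geometry_type.split()
    suffix = parts[1].upper() if len(parts) > 1 else ""
    if suffix not in ("", "Z", "M", "ZM"):
        raise ValueError(f"unrecognized dimension suffix: {suffix!r}")
    # Each suffix letter adds one coordinate field after x and y.
    return ["x", "y"] + list(suffix.lower())


print(expected_coordinate_fields("POINT Z"))
print(expected_coordinate_fields("Point ZM"))
```

A reader could compare this list with the field names of the column's storage type and report a mismatch. Whether a column mixing 2D and 3D types is valid under a native encoding is left open here.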