Support writing Parquet `_metadata` and `_common_metadata` files
#5039
Labels: core, feature request, parquet
See:
io.deephaven.parquet.table.layout.ParquetMetadataFileLayout
https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_metadata.html
https://stackoverflow.com/questions/36739940/parquet-difference-between-metadata-and-common-metadata
We support consuming Parquet metadata files to speed up multi-partition reading and/or support better partitioning column typing. We should add support for writing these files as well. We may choose to take inspiration from the APIs of other libraries, but it may be sufficient to simply add optional `_metadata` and `_common_metadata` path instructions to the writing code in `ParquetTools` and below, especially the multi-table write methods.

We may also consider whether we need any additional tooling for reading and/or updating these files. It would be nice to be able to verify the correctness of existing metadata files, or to add metadata files to existing data sets.