I am investigating a parquet-cpp issue where data in a file written by the parquetjs library cannot be read by the parquet-cpp library. This also seems to be related to another open issue in the parquetjs library: #75.
This happens because parquetjs writes data pages in the DataPageV2 format, which does not appear to be widely supported by Parquet readers such as parquet-cpp. I have opened a pull request in parquet-cpp to improve its DataPageV2 support here.
I see that there is some logic in parquetjs' writer.js to write DataPageV1 pages instead, but this is only accessible through the ParquetEnvelopeWriter API. It would be better if the ParquetWriter class could also default to writing DataPageV1 pages to improve compatibility with other Parquet readers.
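To make the request concrete, here is a minimal sketch of the defaulting behaviour I have in mind. It is not parquetjs' actual code: the function name `resolveDataPageType` and the option name `useDataPageV2` are assumptions made for illustration. The numeric values follow the `PageType` enum in the Parquet format's Thrift definition.

```javascript
// Hypothetical sketch, not taken from parquetjs' writer.js.
// Values mirror the PageType enum in the Parquet format's parquet.thrift.
const PageType = Object.freeze({
  DATA_PAGE: 0,        // DataPageV1
  INDEX_PAGE: 1,
  DICTIONARY_PAGE: 2,
  DATA_PAGE_V2: 3,     // DataPageV2
});

// Pick the data page type for a writer.
// `opts.useDataPageV2` is an assumed option name: V2 is used only when the
// caller explicitly opts in, so the default stays with the older V1 format.
function resolveDataPageType(opts = {}) {
  return opts.useDataPageV2 === true
    ? PageType.DATA_PAGE_V2
    : PageType.DATA_PAGE;
}

// Usage: no options means V1 pages; V2 requires an explicit opt-in.
const defaultType = resolveDataPageType();
const optInType = resolveDataPageType({ useDataPageV2: true });
```

With a default like this, files written through the high-level writer would use DataPageV1 unless the caller asks for V2, while users who need V2 could still get it.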