According to the Parquet data type mappings in the spec, DecimalType should map to INT32 when precision <= 9, INT64 when precision <= 18, and fixed otherwise.
However, Arrow currently writes all decimal types as fixed in Parquet. This may not be a big issue since the logical type is correct, and a fix may require upstream support.
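For reference, the mapping rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this issue, not code from PyIceberg or Arrow: the function names `expected_physical_type` and `min_fixed_bytes` are hypothetical, and the byte-length helper assumes the fixed case uses the smallest signed two's-complement width that can hold the precision.

```python
def expected_physical_type(precision: int) -> str:
    """Return the Parquet physical type the mapping rule expects for a decimal.

    Hypothetical helper: it only restates the rule quoted in this issue.
    """
    if precision < 1:
        raise ValueError("decimal precision must be at least 1")
    if precision <= 9:
        return "INT32"
    if precision <= 18:
        return "INT64"
    return "FIXED_LEN_BYTE_ARRAY"


def min_fixed_bytes(precision: int) -> int:
    """Smallest byte width whose signed range covers 10**precision - 1.

    Assumption: the fixed case stores the unscaled value as a signed
    two's-complement integer of minimal width.
    """
    max_unscaled = 10**precision - 1
    num_bytes = 1
    while (1 << (8 * num_bytes - 1)) - 1 < max_unscaled:
        num_bytes += 1
    return num_bytes
```

With this sketch, a decimal of precision 9 is expected as INT32, precision 18 as INT64, and anything wider as a fixed-length byte array.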
HonahX changed the title from "[Upstream] Mapping from DecimalType to Parquet physical type not aligned with spec" to "[Spec][Upstream] Mapping from DecimalType to Parquet physical type not aligned with spec" on Jul 16, 2024.
Hi @HonahX thank you for raising this issue! Having worked on type casting PRs recently, this one piqued my interest...
It looks like there was a PR merged recently that exposed the feature from C++ Arrow to the Python bindings through a flag store_decimal_as_integer, which was released in version 17.0.0:
Apache Iceberg version
main (development)
Please describe the bug 🐞
According to the Parquet data type mappings in the spec, DecimalType should map to INT32 when precision <= 9, INT64 when precision <= 18, and fixed otherwise.
However, Arrow currently writes all decimal types as fixed in Parquet. This may not be a big issue since the logical type is correct, and a fix may require upstream support.
Updated: Thanks @syun64 for providing the link to the upstream PR that fixes this.
Simple test: