refactor(python): Expose transform as a submodule for pyiceberg_core #628
Signed-off-by: Xuanwo <github@xuanwo.io>
Hi, @sungwy, would you like to take a review? I believe this way is more pythonic.
@Xuanwo thank you for putting up this PR - in hindsight, I agree that this approach is better.
There's an outstanding issue with pyo3 that prevents functions defined within a submodule from being imported like below:
from pyiceberg_core.transform import identity, bucket, year, day, month, hour, truncate
This will result in:
ModuleNotFoundError: No module named 'pyiceberg_core.transform'
In my original PR, I decided that exposing a class and static methods instead of a module would save us from this confusion. But I think it makes sense to settle on module pyiceberg.transform in the hope that this issue will be resolved in the future.
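For readers following along, the failure mode can be reproduced without pyo3 at all. The sketch below is a minimal illustration under stated assumptions: `demo_core` and `demo_core.transform` are hypothetical stand-ins built with `types.ModuleType`, and `identity` is a placeholder function, not the real transform. It shows that a submodule attached only as an attribute of a non-package parent cannot be imported by its dotted name, and that registering it in `sys.modules` is one common workaround. It is not necessarily the fix applied in this PR.

```python
import sys
import types

# Hypothetical stand-ins for the extension module and its submodule.
parent = types.ModuleType("demo_core")
child = types.ModuleType("demo_core.transform")
child.identity = lambda values: values  # placeholder, not the real transform

# The child is reachable as an attribute, but the parent has no __path__,
# so the import system does not treat it as a package.
parent.transform = child
sys.modules["demo_core"] = parent


def import_identity():
    """Attempt the dotted import and return the function or the error."""
    try:
        from demo_core.transform import identity
        return identity
    except ModuleNotFoundError as exc:
        return exc


# Attribute access works, but the dotted import does not.
before = import_identity()

# Workaround: register the submodule under its fully qualified name.
sys.modules["demo_core.transform"] = child
after = import_identity()

print(type(before).__name__)
print(after is child.identity)
```

Attribute access (`demo_core.transform.identity`) works in both cases; only the `from ... import ...` form depends on the `sys.modules` entry.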
Oh, I see the problem. Let me take a look.
Signed-off-by: Xuanwo <github@xuanwo.io>
Hi, @sungwy, I fixed this issue and added a test for it. Please take a look, thanks!
Signed-off-by: Xuanwo <github@xuanwo.io>
Oh man that's amazing. Thank you for the fix!
this looks great, thanks for the refactor! added a minor comment.
I'll also refactor #604 to follow this format.
pub fn truncate(py: Python, array: PyObject, width: u32) -> PyResult<PyObject> {
    apply(py, array, Transform::Truncate(width))
}
should we also include the unknown transform?
https://github.com/apache/iceberg/blob/8d97d54756a1b8ba1986858b2461013864b9e37f/core/src/main/java/org/apache/iceberg/Partitioning.java#L50-L103
Maybe we can add them in follow-up PRs?
sg!
Thank you @sungwy and @kevinjqliu for the quick review. We can implement more transforms in follow-up PRs. I will merge this one first 😘
This PR exposes transform as a submodule of pyiceberg_core.
It also removes the hello_world function, which is not useful.