Selectively overwrite data with python #1101
Thank you for writing this! I'd like a few more data types tested, and then this should be good to go. 😄
Thank you for testing the additional types. I had a few ideas for some quick additional tests, if you could add those too. Then I think this should be ready to go.
ad8e2ff to 108ac88
Thank you for refactoring the tests and adding more. They look good.
I think we can use the existing DeltaJSONEncoder and then fix some other small issues and we are good-to-go.
Could you make sure to rebase and re-run `make format`? We just changed our linter to `ruff`, so there may be some new rules enabled :)
# Conflicts: # python/tests/test_writer.py
* simplify parametrized test * add test cases
108ac88 to 48e8737
Excellent! I'm excited for when we release this :)
Thanks @ismoshkov!
# Description

Currently the high-level Python writer doesn't support partial partition overwrite. This PR enables partition filtering when writing data. The functionality is similar to: https://docs.databricks.com/delta/selective-overwrite.html

The logic checks that the data contains only partitions that pass the filter.

# Documentation

```python
write_deltalake(
    delta_path,
    sample_data,
    mode="overwrite",
    partitions_filters=[("partition_a", ">", "1")],
)
```

---------

Co-authored-by: Ilya Moshkov <ilya.moshkov@exosfinancial.com>
Description
Currently the high-level Python writer doesn't support partial partition overwrite.
This PR enables partition filtering when writing data.
The functionality is similar to:
https://docs.databricks.com/delta/selective-overwrite.html
The logic checks that the data contains only partitions that pass the filter.
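
To illustrate the check described above, here is a minimal sketch in plain Python. It is not the code from this PR: the names `row_matches` and `check_rows_match_filters` are hypothetical, rows are modelled as plain dicts, and values are compared as ordinary Python objects, which may differ from how the library encodes partition values.

```python
import operator
from typing import Any, Dict, Iterable, List, Tuple

# A filter is (column, operator, value), mirroring the tuple shape used in the PR example.
Filter = Tuple[str, str, Any]

# Assumed operator set for this sketch; the library may support a different one.
_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "in": lambda value, allowed: value in allowed,
    "not in": lambda value, excluded: value not in excluded,
}


def row_matches(row: Dict[str, Any], filters: List[Filter]) -> bool:
    """Return True if the row satisfies every filter (filters are AND-ed)."""
    return all(_OPS[op](row[column], expected) for column, op, expected in filters)


def check_rows_match_filters(rows: Iterable[Dict[str, Any]], filters: List[Filter]) -> None:
    """Raise ValueError if any row falls outside the partitions selected by the filters."""
    bad = [row for row in rows if not row_matches(row, filters)]
    if bad:
        raise ValueError(
            f"{len(bad)} row(s) fall outside the partition filters: {filters}"
        )
```

Rejecting the write when any row falls outside the filter keeps the overwrite from leaving data in partitions it was not supposed to touch.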
Documentation