docs: create Dagster integration page #2159
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
@roeap - I've moved the contents from the GDoc into this draft PR. LMK when you've had a chance to look over the rest and we can take it from there!
def dataset_partitioned(
    context,
    clean_dataset: pa.Table,
) -> pa.Table:
    animals = context.asset_partition_key_for_output()
    table = clean_dataset

    return table.filter(pc.field("animals") == animals)
I'm having trouble getting the PyArrow code in the partitioning section to work. I must be overlooking something. Would very much appreciate an extra set of eyes to help me find my blind spot 🙂
The pandas version listed here works fine.
It may also be worthwhile to include these changes in the docs: dagster-io/dagster#19343
I'd also like to suggest adding Dagster to the delta.io/integrations page, if you think that makes sense.
@dennyglee - sounds good to me! I will finalize the text with @roeap here and then port it over to
@dennyglee - good call to get this added to the Delta Lake website, I created an issue for this work.
LGTM
defs = Definitions(
    assets=all_assets,
    resources={
        "io_manager": PolarsDeltaIOManager(
@avriiil this is the IO manager from dagster-polars, which I also updated a while ago. That implementation relies on UPathIOManager instead of DBIOManager, so I would say it's better to change this to DeltaLakePolarsIOManager, since that one uses native Delta partitioning.
Thanks @ion-elgreco!
This adds a Dagster integration page to the docs.