docs: create Dagster integration page #2159

Merged: 4 commits, Mar 7, 2024

Conversation

avriiil
Contributor

@avriiil avriiil commented Feb 1, 2024

This adds an Integration page to the docs re: Dagster.


github-actions bot commented Feb 1, 2024

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@avriiil
Contributor Author

avriiil commented Feb 1, 2024

@roeap - I've moved the contents from the GDoc into this draft PR. LMK when you've had a chance to look over the rest and we can take it from there!

Comment on lines +155 to +162
def dataset_partitioned(
    context,
    clean_dataset: pa.Table,
) -> pa.Table:
    animals = context.asset_partition_key_for_output()
    table = clean_dataset

    return table.filter(pc.field("animals") == animals)
Contributor Author

Having trouble getting the PyArrow code in the partitioning section to work. I must be overlooking something. Would very much appreciate an extra set of eyes to help me find my blind spot 🙂

The pandas version listed here works fine.

@avriiil changed the title from "Create Dagster integration page" to "docs: create Dagster integration page" on Feb 1, 2024
@ion-elgreco
Collaborator

It may also be worthwhile to include these changes in the docs: dagster-io/dagster#19343

@dennyglee
Collaborator

I'd also suggest adding Dagster to the delta.io/integrations page, if you think that makes sense.

@avriiil
Contributor Author

avriiil commented Feb 6, 2024

@dennyglee - sounds good to me! I will finalize the text with @roeap here and then port it over to delta.io/integrations with pyspark code examples.

@MrPowers
Contributor

@dennyglee - good call to get this added to the Delta Lake website. I created an issue for this work.

@MrPowers
Contributor

MrPowers commented Mar 4, 2024

Hey @avriiil - can you please take this out of draft mode and add this page to the table of contents (in this file)? Thank you!

@avriiil avriiil marked this pull request as ready for review March 7, 2024 10:48
@avriiil avriiil requested a review from MrPowers as a code owner March 7, 2024 10:48
Contributor

@MrPowers MrPowers left a comment

LGTM

@MrPowers MrPowers merged commit fe36b13 into delta-io:main Mar 7, 2024
23 checks passed
defs = Definitions(
    assets=all_assets,
    resources={
        "io_manager": PolarsDeltaIOManager(
Collaborator

@ion-elgreco ion-elgreco Mar 8, 2024

@avriiil this is the IO manager from dagster-polars, which I also updated a while ago. That implementation relies on UPathIOManager instead of DBIOManager, so I would say it's better to change this to DeltaLakePolarsIOManager, since that one uses native Delta partitioning.

Contributor Author

Thanks @ion-elgreco!
