Moves scaling out of experimental #286

elijahbenizzy · 2023-08-22T21:14:58Z


Changes

How I tested this

Notes

Checklist

- PR has an informative and human-readable title (this will be pulled into the release notes)
- Changes are limited to a single goal (no scope creep)
- Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
- Any change in functionality is tested
- New functions are documented (with a description, list of inputs, and expected output)
- Placeholder code is flagged / future TODOs are captured in comments
- Project documentation has been updated if adding/changing functionality.

The following all move to plugins:

- h_spark
- h_dask
- h_ray

We preserve the `h_...` names to avoid name clashes with imports of the libraries themselves. Note that we also deprecate the idea of having "_implementations" modules. People will be importing these, so we want the names to be easy to refer to and remember. This is why polars_implementations is deprecated in favor of h_polars. We leave in references to all previously released constructs (note: the new pyspark API has not been released, so it's OK to remove that from the experimental.h_spark file).
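For illustration only, here is one way an old import path could keep working while steering users to the new name. This is a sketch, not what this PR does: `make_deprecated_alias` and the `demo_plugins.*` module names are hypothetical, and the modules are built in memory so the snippet runs on its own.

```python
import types
import warnings


def make_deprecated_alias(old_name: str, new_module: types.ModuleType) -> types.ModuleType:
    """Build a stand-in module called `old_name` that forwards to `new_module`.

    Hypothetical helper: any attribute not found on the alias is looked up on
    `new_module`, and a DeprecationWarning is emitted for each such lookup.
    """
    alias = types.ModuleType(old_name)

    def _forward(attr: str):
        warnings.warn(
            f"{old_name} is deprecated; import from {new_module.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    # Module-level __getattr__ (PEP 562) is only called for missing attributes.
    alias.__getattr__ = _forward
    return alias


# Stand-ins for the renamed module and its old location (names are assumptions).
new_mod = types.ModuleType("demo_plugins.h_polars")
new_mod.RESULT_BUILDER = "stand-in for a previously released construct"
old_mod = make_deprecated_alias("demo_plugins.polars_implementations", new_mod)
```

Accessing `old_mod.RESULT_BUILDER` returns the same object as `new_mod.RESULT_BUILDER` and raises a `DeprecationWarning` naming the new module.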

hamilton/registry.py

skrawcz · 2023-08-23T05:40:23Z

Need a README somewhere too (maybe in plugins) to explain how to add/adjust and name files.

skrawcz

Plugins will get large. We also don't want very large files either.

So the evolution should be into packages, right?
h_spark.py -->

`__init__.py`
... modules organized however we want

And then similarly for *_extensions.py...?

elijahbenizzy · 2023-08-23T15:27:21Z

> Plugins will get large. We also don't want very large files either.
>
> So the evolution should be into packages, right? h_spark.py -->
>
> `__init__.py`
> ... modules organized however we want
>
> And then similarly for *_extensions.py...?

Yeah, I think that'll be the best way for it to evolve. Should be able to do it and keep compatible references later on, no need to break into packages yet.

This is for getting registration to work with pyspark. While pyspark does have natural column/dataframe objects, they are not usable in the same way, so we don't want to register the same set of constructs as we would with, say, polars or pandas. Plugins can opt out by specifying `COLUMN_FRIENDLY_DF_TYPE = False`; the flag defaults to `True`.
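A minimal sketch of how such an opt-out could be honored, assuming the flag is read off the plugin module with a default of `True`. Only the flag name `COLUMN_FRIENDLY_DF_TYPE` comes from the description above; `register_plugin`, `COLUMN_EXTRACTORS`, `get_column`, and the fake modules are hypothetical and do not reflect the actual contents of hamilton/registry.py.

```python
import types
from typing import Any, Callable, Dict

# Hypothetical registry: plugin name -> function that pulls a column out of a dataframe.
COLUMN_EXTRACTORS: Dict[str, Callable[[Any, str], Any]] = {}


def register_plugin(plugin: types.ModuleType) -> bool:
    """Register the plugin's column helper unless the plugin opts out.

    Returns True if the helper was registered, False if the plugin set
    COLUMN_FRIENDLY_DF_TYPE = False. A missing flag is treated as True.
    """
    if not getattr(plugin, "COLUMN_FRIENDLY_DF_TYPE", True):
        return False
    COLUMN_EXTRACTORS[plugin.__name__] = plugin.get_column
    return True


# Fake plugin that relies on the default (flag not set).
pandas_like = types.ModuleType("pandas_like_extensions")
pandas_like.get_column = lambda df, name: df[name]

# Fake plugin that opts out, as a pyspark-style plugin might.
spark_like = types.ModuleType("spark_like_extensions")
spark_like.COLUMN_FRIENDLY_DF_TYPE = False

registered_pandas = register_plugin(pandas_like)
registered_spark = register_plugin(spark_like)
```

Here the first plugin ends up in `COLUMN_EXTRACTORS` and the second does not, without the second having to define any column helper at all.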

Pyspark has a bunch of hidden dependencies -- these are solved by the sql and connect targets

skrawcz

I think things work.

elijahbenizzy force-pushed the scaling-out-of-experimental branch 2 times, most recently from a749b00 to 2fa412c Compare August 22, 2023 21:51

elijahbenizzy changed the title ~~WIP~~ Moves scaling out of experimental Aug 22, 2023

elijahbenizzy force-pushed the scaling-out-of-experimental branch 5 times, most recently from 14c5519 to c2720cd Compare August 22, 2023 22:31

elijahbenizzy requested a review from skrawcz August 22, 2023 22:33

elijahbenizzy force-pushed the scaling-out-of-experimental branch 3 times, most recently from 09a20b6 to f8241da Compare August 23, 2023 05:03

elijahbenizzy force-pushed the scaling-out-of-experimental branch 5 times, most recently from 6d30b24 to 8236b47 Compare August 23, 2023 05:18

skrawcz reviewed Aug 23, 2023

hamilton/registry.py (outdated)

skrawcz reviewed Aug 23, 2023

elijahbenizzy force-pushed the scaling-out-of-experimental branch from 8236b47 to a445aa3 Compare August 23, 2023 19:23

elijahbenizzy force-pushed the scaling-out-of-experimental branch from a445aa3 to 36b7414 Compare August 23, 2023 19:29

Fixes pyspark dependencies in install target

f190f6e

Pyspark has a bunch of hidden dependencies -- these are solved by the sql and connect targets

elijahbenizzy force-pushed the scaling-out-of-experimental branch from 36b7414 to f190f6e Compare August 23, 2023 20:30

elijahbenizzy added 2 commits August 23, 2023 13:34

Bumps version to 1.27.0

eb1f93c

Improves pyspark README

18e7d65

elijahbenizzy force-pushed the scaling-out-of-experimental branch from 05ce983 to 18e7d65 Compare August 23, 2023 21:50

elijahbenizzy force-pushed the scaling-out-of-experimental branch 2 times, most recently from e460bde to 48084cc Compare August 24, 2023 00:12

Adds ModuleNotFoundError to try/catch for spark_extensions

0bcbcec
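As a rough illustration of the pattern this commit title describes, the sketch below imports optional extension modules and skips any whose import raises `ModuleNotFoundError` (for example, because the backing library is not installed). `load_optional_extensions` is a hypothetical helper, not code from this PR, and the module names in the usage line are placeholders.

```python
import importlib
from typing import List


def load_optional_extensions(module_names: List[str]) -> List[str]:
    """Import each named module, skipping those that cannot be found.

    Returns the names that imported successfully. ModuleNotFoundError also
    covers the case where the extension exists but one of its own imports
    (such as an optional third-party library) is missing.
    """
    loaded = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ModuleNotFoundError:
            continue  # optional dependency absent; leave this extension out
        loaded.append(name)
    return loaded


# "json" is in the standard library; the second name is assumed not to exist.
available = load_optional_extensions(["json", "surely_missing_extension_xyz"])
```

Catching only `ModuleNotFoundError` keeps other import-time failures, such as a syntax error inside an extension, visible instead of silently swallowing them.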

elijahbenizzy force-pushed the scaling-out-of-experimental branch from 48084cc to 0bcbcec Compare August 24, 2023 00:13

skrawcz approved these changes Aug 24, 2023

elijahbenizzy merged commit bc9f2b9 into main Aug 24, 2023

elijahbenizzy deleted the scaling-out-of-experimental branch August 24, 2023 00:31
