-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[speculative idea] enable node redefinitions #65
Comments
Some of this could also be mitigated by reusing a driver/sub-dag reuse e.g. |
OK, so this makes implementation/onboarding easier, but doesn't help readability/maintainability of pipeline... However, its nice to not have to draw up a bunch of different functions. The cool thing about this is this could actually be syntactic sugar for merging nodes together/composing -- doesn't even have to be its own nodes. Implementation could be tricky -- with the current decorator model (decorator => multiple nodes, no context), we'd have to have some notion of state, otherwise we'd have no idea what to call them. OTOH, we could take a click-like approach, which might make this easier to implement: What this does is it allows us to attach this function to the end of the def date() -> pd.Series:
return ['20230101', '20230102', ...']
@date.redefine
def date(date:pd.Series) -> pd.Series:
return pd.datetime(date)
@date.redefine
@extract_columns("date", "bar")
def some_filter_step(date:pd.Series, bar: pd.Series, filter: str) -> pd.Series:
df = pd.DataFrame({"date": date, "bar" : bar})
df = df[data < filter]
return df This also clarifies what is being redefined... This gets ugly when we have extract_columns/multiple outputs that get redefined, but I'm happy for that not to be allowed. That said, we could always do: @some_filter_step.bar.redefine
// or
@some_filter_step.redefine('bar')
def bar(bar: pd.Series) -> pd.Series:
return bar*2 Which would give us the information we need. |
Closing this (for now) as |
Sunday Jan 29, 2023 at 06:49 GMT
Originally opened as stitchfix/hamilton#290
Is your feature request related to a problem? Please describe.
it is common outside of hamilton for people to "redefine columns" without mutation in value, e.g. in the case of filtering & joining data sets, the values wont change, but the indexes might. This right now forces one to rename things in hamilton if you want
to stitch together a single DAG.
Describe the solution you'd like
Idea:
Describe alternatives you've considered
You rename columns in these cases.
Additional context
Not sure how good an idea this is. But it's something to potentially try out and see how it looks.
The text was updated successfully, but these errors were encountered: