-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Override nodes at module import - Feature Sketch #1134
Override nodes at module import - Feature Sketch #1134
Conversation
Sharing Slack comments for lineage Thanks for the contribution! I see the thread of issues and Slack convo. An idea would be to use find_functions() , which is a public-facing API part of hamilton.graph_utils. Ideally, it would allow us to only make "surface level" changes closer to the Builder. Default behaviorWe can keep .with_modules() as-is and avoid changes to other internals (Driver, FunctionGraph). This ensures safety for all Hamilton users Override behaviorWhen specifying .with_modules(..., allow_overrides=True), def with_modules(*modules):
if allow_overrides:
functions_seen = set()
for module in modules:
functions = graph_utils.find_functions(module)
for name, callabl in functions:
if name in functions_seen
del getattr(module, name)
else:
functions_seen.add(name)
self.modules.extend(modules) |
@zilto this specifically doesn't allow for node overriding, this is only function overriding. Burt also, there is no inherent problem changing the core functionality -- it shouldn't be static. In this case the graph is responsible for assembling the nodes. It also has the current error message that breaks, so it makes sense to change that. We have more flexibility if we adjust the internal API to change later than using user-facing functions, which tie us to the external API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Left some comments on specifics. High-level:
- You'll want docs (see slack for how to add/develop). Make sure to reference caveats (why you might/might not want to do this)
- Wondering if we want more in the example -- a README/notebook might be nice. If you have a compelling version of why this feature is useful that woudl be nice, otherwise don't want to upscope this! You're not the one that asked for this feature, you're the one building it :)
@@ -0,0 +1,22 @@ | |||
import module_a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a README in the examples module with how to run/what we learn? Maybe a notebook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am on the fence about having this in the examples as I don't really have a use case for it. Maybe we remove it for now and ask Yijun Tang who opened the issue if he can give a rough sketch of how he will use it. Happy to implement it, just don't want it to be a very contrived example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! So just a simple README then might be nice -- one that says what this does/a quick description of the feature. Can ignore a notebook until we have it ready.
hamilton/driver.py
Outdated
@@ -2010,13 +2031,25 @@ def build(self) -> Driver: | |||
adapter=lifecycle_base.LifecycleAdapterSet(*adapter), | |||
) | |||
|
|||
if not self.module_overrides: # if override on than this doesn't matter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, interesting... This is actually a different type of module override -- those that are the same. Can you elaborate more on the case that it broke? Feels like an upstream thing... I'm not 100% sure I trust the module in module_set
-- identity is a little weird/what happens if you make an update but keep an old reference around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, if you do something like
.with_modules(module_A, module_A)
hamilton will just add it to the list of modules and treat them as two different ones. If you print the attribute, you get two pointers in the list
[<module 'module_A' from 'path to module A'>,<module 'module_A' from 'path to module A'>, <module 'module_A' from 'path to module A'>,<module 'module_A' from 'path to module A'>]
Then when you try to build the driver, because they are treated as separate you run into the issue that you have same-named functions and get an error that functions have to have different names.
The module_set
tries to get create an ordered set out of the list we use for storing the module references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So yeah, what I"m saying is thyis feels like a different use-case, and has a different problem that informed it?
I guess I'm just worried that it might hide unintended behavior...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, I admit it's probably a niche thing that may not come up again, so I'm happy to take it out. In case someone else flags it, it's not hard to add it later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I mean, maybe a separate PR? Or an issue with the proposed fix?
I feel like the README is the most concise description of this functionality, so I am wondering if we should link to it more? |
Looking good -- to get it out:
|
Yep, agreed. Can link from docstring and/or from concepts section. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, thank you!
This addresses #1120 and adds the capability to override nodes from later imported modules. It also suggests how to handle the case when we import the same module twice.
Changes
Adds
allow_module_overrides
as a method in the Builder and as an attribute in the Driver. If set true, it allows the node with the same name to be overwritten.How I tested this
There are only two small mock scripts to have a running sketch; I will add unit tests.
Notes
Checklist