@ignore #1172

Merged
merged 3 commits into from
Oct 12, 2024

Conversation

jernejfrank
Contributor

@jernejfrank jernejfrank commented Oct 8, 2024

Solves #1168.

Changes

  • implemented ignore decorator
  • used it internally in mutate to automatically hide helper functions (this resolves a TODO)
  • exposed it for general use
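
As a rough illustration of the idea behind the decorator (not Hamilton's actual implementation), the sketch below tags a function with an attribute and has a collector skip tagged functions along with underscore-prefixed ones. The names `skip_node`, `SKIP_ATTR`, and `collect_node_names` are hypothetical and chosen only for this example.

```python
import inspect
from typing import Any, Callable, Dict, List

# Hypothetical marker attribute; not a name taken from Hamilton.
SKIP_ATTR = "_excluded_from_graph"


class skip_node:
    """Stand-in decorator, instantiated before use: ``@skip_node()``."""

    def __call__(self, fn: Callable) -> Callable:
        # Tag the function and return it unchanged so it stays callable.
        setattr(fn, SKIP_ATTR, True)
        return fn


def collect_node_names(namespace: Dict[str, Any]) -> List[str]:
    """Return the names of functions that should become graph nodes."""
    return sorted(
        name
        for name, obj in namespace.items()
        if inspect.isfunction(obj)
        and not name.startswith("_")  # excluded by naming convention
        and not getattr(obj, SKIP_ATTR, False)  # excluded by explicit marker
    )


def raw_data() -> List[int]:
    return [3, 1, 2]


def _sort_helper(values: List[int]) -> List[int]:
    return sorted(values)


@skip_node()
def drop_first(values: List[int]) -> List[int]:
    return values[1:]


example_namespace = {
    "raw_data": raw_data,
    "_sort_helper": _sort_helper,
    "drop_first": drop_first,
}
node_names = collect_node_names(example_namespace)
```

In this sketch only `raw_data` is collected. Both helpers are left out, one by the naming convention and one by the marker, and the decorated helper keeps its original name and stays callable.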

How I tested this

  • unit test
  • on the abstract mutate example (for both step and mutate)

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

👍 Looks good to me! Reviewed everything up to be0025a in 29 seconds

More details
  • Looked at 275 lines of code in 8 files
  • Skipped 1 file when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. hamilton/function_modifiers/__init__.py:32
  • Draft comment:
    The ignore decorator is being instantiated incorrectly. It should be used directly without calling ignore().
  • Reason this comment was not posted:
    Marked as duplicate.
2. hamilton/function_modifiers/configuration.py:285
  • Draft comment:
    The ignore decorator is being instantiated incorrectly. It should be used directly without calling ignore().
  • Reason this comment was not posted:
    Marked as duplicate.
3. hamilton/function_modifiers/macros.py:624
  • Draft comment:
    The ignore decorator is being instantiated incorrectly. It should be used directly without calling ignore().
  • Reason this comment was not posted:
    Marked as duplicate.
4. hamilton/function_modifiers/macros.py:1373
  • Draft comment:
    The ignore decorator is being instantiated incorrectly. It should be used directly without calling ignore().
  • Reason this comment was not posted:
    Marked as duplicate.
5. hamilton/function_modifiers/configuration.py:280
  • Draft comment:
    The ignore method is being called as a static method of the ignore class, which is unnecessary and confusing. Consider refactoring to make ignore a standalone function or a class method, but not both.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment points out a potential design issue where a static method is used to return an instance of the same class, which can be confusing. This is a valid point as it might lead to misunderstandings about the method's purpose. The suggestion to refactor it to either a standalone function or a class method is actionable and clear.
    The comment assumes that the current design is confusing without considering if there might be a specific reason for this design choice. It also doesn't consider if the method is used elsewhere in a way that justifies its current form.
    While there might be a reason for the current design, the comment highlights a potential improvement in code clarity, which is generally beneficial. The suggestion is clear and actionable, making it a valid comment to keep.
    The comment is valid as it highlights a potential improvement in code clarity by suggesting a refactor of the ignore method. It is actionable and clear, so it should be kept.

Workflow ID: wflow_ngnsvBze9BWOOPMu


Collaborator

@elijahbenizzy elijahbenizzy left a comment

Looks good! Small nits. One thought -- is @ignore (the name I suggested) the best name?

Other thoughts (would love your ideas):

  • @hamilton_ignore
  • @exclude
  • @hamilton_exclude

And another thought -- should we have an option in @mutate and step to not ignore them? E.g., is there any case where we would also want to include the function in the pipeline? (Instinct says no here...)

hamilton/function_modifiers/__init__.py (outdated)
@@ -620,6 +620,8 @@ def step(
they will be converted to a value (a literal)
:return: an applicable with the function applied
"""
# This function will be excluded from the DAG as a node since we are inserting it manually
fn = ignore()(fn)
Collaborator

So, thinking about this, I'm not sure we should be injecting this with step -- to me this might be a parameter we want on pipe_output.

The problem is we won't know whether a function is included until later (when it's referred to), which feels harder to parse. So maybe remove this, or put this behind a default-off toggle?

Contributor Author

@skrawcz made a good point about mixing functions and losing readability. Maybe it's better to omit it for step and pipe_* and leave it to the user? Then each function will at least have a decorator or a "_" prefix, and it is clear again.
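
The default-off toggle floated in this thread could look roughly like the sketch below. `make_step`, its `exclude_from_graph` parameter, and the marker attribute are all hypothetical; this is not the signature of Hamilton's step. The sketch also shows the readability concern: whether a function is hidden is decided where it is referenced, not where it is defined.

```python
from typing import Callable, List, NamedTuple

# Hypothetical marker attribute; not a name taken from Hamilton.
SKIP_ATTR = "_excluded_from_graph"


class Step(NamedTuple):
    """Minimal stand-in for a pipeline step that wraps a function."""

    fn: Callable


def make_step(fn: Callable, exclude_from_graph: bool = False) -> Step:
    # Default off: the function is only hidden when the caller asks for it.
    if exclude_from_graph:
        setattr(fn, SKIP_ATTR, True)
    return Step(fn=fn)


# Both definitions look identical at this point in the file.
def scale(values: List[int], factor: int = 2) -> List[int]:
    return [v * factor for v in values]


def shift(values: List[int], offset: int = 1) -> List[int]:
    return [v + offset for v in values]


# Only here, at the point of reference, does their treatment diverge.
kept = make_step(scale)
hidden = make_step(shift, exclude_from_graph=True)

scale_is_hidden = getattr(scale, SKIP_ATTR, False)
shift_is_hidden = getattr(shift, SKIP_ATTR, False)
```

A reader scanning the definitions of `scale` and `shift` cannot tell which one ends up as a node, which is the argument for leaving the decision to an explicit decorator or a `_` prefix at the definition site.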

Comment on lines 37 to -38
@mutate(data_2, missing_row=value(["c", 145]))
def _add_missing_value(some_data: pd.DataFrame, missing_row: List[Any]) -> pd.DataFrame:
Collaborator

I think it is okay not to have the _ prefix here from an API UX consistency view, only because we decorate it with a Hamilton decorator -- so you will know that this turns into part of the DAG for a node.
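
To make the point about the decorated helper concrete, here is a minimal sketch of a decorator that both registers a function as a transformation of a target and hides it from node collection, so no `_` prefix is needed. `mutate_like`, `TRANSFORMS`, `evaluate`, and the marker attribute are hypothetical stand-ins, and plain lists replace the DataFrame from the snippet above; this is not how Hamilton's @mutate is implemented.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical marker attribute and registry; not names taken from Hamilton.
SKIP_ATTR = "_excluded_from_graph"
TRANSFORMS: DefaultDict[str, List[Callable]] = defaultdict(list)


def mutate_like(*targets: Callable) -> Callable[[Callable], Callable]:
    """Register the decorated function as a transform of each target."""

    def decorator(fn: Callable) -> Callable:
        # The helper is applied through its targets, so hide it as a node.
        setattr(fn, SKIP_ATTR, True)
        for target in targets:
            TRANSFORMS[target.__name__].append(fn)
        return fn

    return decorator


def data_2() -> List[list]:
    return [["a", 1], ["b", 2]]


@mutate_like(data_2)
def add_missing_value(rows: List[list]) -> List[list]:
    # No underscore prefix: the decorator already signals the helper role.
    return rows + [["c", 145]]


def evaluate(target: Callable) -> List[list]:
    """Compute the target, then apply its registered transforms in order."""
    result = target()
    for transform in TRANSFORMS[target.__name__]:
        result = transform(result)
    return result


final_rows = evaluate(data_2)
```

In this sketch the helper's output shows up as part of `data_2`, while the helper itself carries the marker that a collector could use to skip it.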

@jernejfrank
Contributor Author

Having looked at this with fresh eyes, I agree: we should make it explicit that it is for Hamilton to avoid confusion. How about hamilton_skip?

A decorator that lets you exclude functions from the DAG.

- Exposed for use by users
- Useful for internal wiring, since it avoids renaming functions to
  prefix them with underscores
- Add hamilton_skip to mutate so the helper function is always hidden;
  for other decorators/functions this is left to the user, to keep
  code readable
@zilto
Collaborator

zilto commented Oct 9, 2024

Thoughts on naming:

  • we have @cache(behavior="ignore")
  • we can think of caching as "skipping a node"
  • As far as I understand, using @ignore is equivalent to prefixing the function with _. Instead of the decorator name indicating "what it does" (ignore, exclude), why not indicate "what it is". For consistency with the Hamilton documentation, I would call this @helper or @utility because you're indicating that this is a utility function.

Other nit:

  • use underscores instead of whitespace in the path examples/mutate/abstract functionality blueprint

@elijahbenizzy
Collaborator

I think it should be very specific that Hamilton is the one doing it -- so I'm in favor of @hamilton_exclude.

That said, I don't think this should be the first way to do things (it should be with _), but there are times it'll be useful (the docs here hit it right, it should show up in the docs, but not in the "How-to" section)

@@ -18,6 +18,7 @@ Note the following:

* ``@config`` If you're feeling adventurous, you can pass in a lambda function that takes in the entire configuration and resolves to ``True`` or ``False``. You probably don't want to do this.

* To always exclude a function (such as helper functions) from the DAG you can also use ``@hamilton_skip``.
Collaborator

Mention that we can use _ as a prefix (preferred) or @hamilton_skip

@elijahbenizzy
Collaborator

Looks good, let's just change the wording of the docs to indicate that _ is preferred!

Collaborator

@elijahbenizzy elijahbenizzy left a comment

Looks good! Let's just change the wording to show that _ is preferred

The new name makes the functionality clearer.
@elijahbenizzy elijahbenizzy merged commit 3bc6c40 into DAGWorks-Inc:main Oct 12, 2024
24 checks passed
@jernejfrank jernejfrank deleted the feat/ignore branch October 21, 2024 03:26
4 participants