Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes jupyter magic to create temporary files #855

Merged
merged 2 commits into from
Apr 30, 2024

Conversation

skrawcz
Copy link
Collaborator

@skrawcz skrawcz commented Apr 26, 2024

Temporary files allow us to then hash the code properly for use with the Hamilton UI.

This does not change any functionality. If the interpreter is shutdown gracefully then the temporary files are deleted cleanly.

Changes

  • adjust function that creates the module in the jupyter magic

How I tested this

  • locally
  • on databricks

Notes

  • this means magics in notebooks can be cleanly used with the Hamilton UI.

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

Copy link
Collaborator

@elijahbenizzy elijahbenizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to ad-hoc utils?

@zilto
Copy link
Collaborator

zilto commented Apr 29, 2024

By adding a guard clause, the already implemented module_from_source properly behaves for logging in the UI.

The issue was in value.__package__.startswith(module.__package__) because .startswith() expects a string and throws an error for module.__package__ is None. (source)

    for name, value in inspect.getmembers(module):
        # Check if the attribute is a module
        if inspect.ismodule(value):
            if value.__package__ is None:
                logger.info(
                    f"Skipping hash for module {value.__name__} because it has no __package__ "
                    f"attribute or it is None. This happens with lazy loaders."
                )
                continue
            # Check if the module is in the same top level package
            if module.__package__:
              if value.__package__ != module.__package__ and not value.__package__.startswith(
                  module.__package__
              ):
                logger.debug(
                  f"Skipping hash for module {value.__name__} because it is in a different "
                  f"package {value.__package__} than {module.__package__}"
                )
                continue
            
            # Recursively hash the sub-module
            hash_object = _hash_module(value, hash_object, seen_modules)

    # Return the hash object
    return hash_object

As far as I understand, this code logic is meant to reduce redundant checks by skipping versioning of members of the same module. Skipping checks affects what is included in the hash computation, but this raises the question:

  • should the same code found in a notebook cell and in a Python module have the same hash?
  • if they should have the same hash, dropping this logic to skip modules, which only provides truly marginal speed benefits (vs. running the actual dataflow), should be preferred to relying on tempfiles and copying code

Currently working on fixing the logic to log code. This should be available on HamiltonNode objects

skrawcz added 2 commits April 29, 2024 23:39
Temporary files allow us to then hash the code properly for
use with the Hamilton UI.

This does not change any functionality. If the interpreter is shutdown
gracefully then the temporary files are deleted cleanly.
This is a better place to live.

Makes note that one function is deprecated.
@skrawcz skrawcz force-pushed the adjust_magic_module_import branch from f51cc4c to 714d9cd Compare April 30, 2024 06:52
@skrawcz skrawcz merged commit 931f7ee into main Apr 30, 2024
23 checks passed
@skrawcz skrawcz deleted the adjust_magic_module_import branch April 30, 2024 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants