Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UI fix code display for temporary modules #860

Closed
wants to merge 1 commit into from
Closed

Conversation

zilto
Copy link
Collaborator

@zilto zilto commented Apr 29, 2024

Temporary modules, most commonly generated in notebook cells, were throwing errors with the hamilton_sdk.adapters.HamiltonTracker. Then, the UI wouldn't properly display the registered code artifacts.

The fixes aim to solve the same issues as #855. Given the problems are related to the Hamilton SDK, it seems better to apply changes there than in the Hamilton library where module_from_source is used in contexts with different constraints (CLI, LSP, VSCode extension, Jupyter extension, and more)

Changes

  1. _hash_module() doesn't expect for modules to have a __package__ attribute. Modules imported in a Jupyter cell wouldn't have this attribute set, for example pd in the following. This changes makes the HamiltonTracker fully functional.
%%cell_to_module
import pandas as pd 

def A() -> pd.DataFrame:
  return ...
  1. After change 1, the source code associated with each node wouldn't be displayed. This is because _slurp_code() looks for a file instead of the module object itself. If module.__file__ is None we now read the source code from the linecache (set by ad_hoc_utils.module_from_source(). Then, the path attribute in _slurp_code() needs to match the path attribute in extract_code_artifacts_from_function_grap()

How I tested this

  • hamilton tests ran successfully
  • hamilton_sdk tests ran successfully except two tests related to hashing
  • source code renders properly in the UI for script and notebook usage.

⛔ This makes two SDK test fail. It's probably sufficient to update the hash value, but wanted to double check

================================== short test summary info ==================================
FAILED tests/test_driver.py::test_hash_module_with_subpackage - AssertionError: assert '4466d1f61b2c...ef83e3d6c6bf8' == '9f0b697b6b07...d772b8bd392c7'
FAILED tests/test_driver.py::test_hash_module_complex - AssertionError: assert 'c22023a4fdc8...b789b80f19896' == 'fc568608a2f7...104c1ba2a5e1c'

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

Copy link
Collaborator

@elijahbenizzy elijahbenizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No UI fixes? It just works on the client side?

"""Create a temporary module from source code"""
module_name = _generate_unique_temp_module_name()
module_name = module_name if module_name else _generate_unique_temp_module_name()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably should be using a hash of the source. E.G. so we don't repeat it a ton of times. Not sttrictly necessary but feels good.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best to have module_name as a required argument. However, didn't want to make that change now to limit the scope of this PR

Copy link
Collaborator

@skrawcz skrawcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I change the jupyter magic to use the source version, it does not seem to work properly when I change the cell and update the code -- the source code tracked is still the same from the first iteration.

@zilto
Copy link
Collaborator Author

zilto commented Apr 30, 2024

if I change the jupyter magic to use the source version, it does not seem to work properly when I change the cell and update the code -- the source code tracked is still the same from the first iteration.

I believe I will need to edit line 52

if hasattr(module, "__file__") and module.__file__ is not None:

@zilto
Copy link
Collaborator Author

zilto commented Apr 30, 2024

@skrawcz my understanding is that these changes won't be merged as we're rolling create_module() for the Jupyter magic v2. Feel free to close this branch if that's the case

@zilto
Copy link
Collaborator Author

zilto commented Apr 30, 2024

While working on other things found this code. It shows the risk of constraints of a particular use case creeping to the main library.

    @functools.cached_property
    def version(self) -> typing.Optional[str]:
        # ... 
        try:
            # return hash of first function. It could be that others are Hamilton framework code.
            return hash_source_code(self.originating_functions[0], strip=True)
        except (
            OSError
        ):  # TODO -- ensure we can get the node hash in a databricks environment when using jupyter magic
            logger.warning(
                f"Failed to hash source code for node {self.name}. Certain environments (such as databricks) do not allow it."
                " In this case, version will be None."
            )
            return None

@zilto zilto closed this May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants