`plugin` MLFlow #945

zilto · 2024-06-10T23:49:34Z

The MLFlow plugin for Hamilton includes two sets of features:

- Save and load machine learning models with the `MLFlowModelSaver` and `MLFlowModelLoader` materializers.
- Automatically track data pipeline results in MLFlow with the `MLFlowTracker`.

Changes

- `hamilton/plugins/mlflow_extensions.py` contains the materializers
- `hamilton/plugins/h_mlflow.py` contains the tracker
- `examples/mlflow/tutorial.ipynb` is a tutorial notebook

How I tested this

- `tests/plugins/test_mlflow_extensions.py` tests the materializers
- currently no tests for the tracker

TODO / ideas

- Store the output of `display_all_functions()` and `visualize_execution()` from the `HamiltonTracker` in the MLFlow tracking server.
- Couple MLFlow experiments and runs more strongly with Hamilton UI projects and runs. The MLFlow UI allows markdown fields, which could contain a link to the Hamilton UI.
- Log the "datasets" used, for tracking in the MLFlow UI. The API accepts a "digest", which is equivalent to our fingerprinting concept.
- Log model input signatures.
- Add an example showing how to do hyperparameter tuning with nested runs.
- Explore the MLFlow features for LLMs and evaluation.
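
The "digest" item in the TODO list above refers to a short content hash that identifies a dataset. A minimal sketch of the idea, using only the standard library, could look like the following. `dataset_digest` is a hypothetical helper, not part of Hamilton or MLFlow, and it does not reproduce how either project actually computes fingerprints or digests.

```python
import hashlib
import json
from typing import Any, Dict, List


def dataset_digest(rows: List[Dict[str, Any]], length: int = 8) -> str:
    """Return a short, deterministic hash of a small JSON-serializable dataset.

    Hypothetical helper for illustration only.
    """
    # Serialize with sorted keys so that key order does not change the hash.
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    full_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Keep only a prefix so the value stays readable in a UI.
    return full_hash[:length]
```

Two datasets with the same content yield the same digest, so a value like this could serve as the link between a tracked run and the data it consumed.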

ellipsis-dev

❌ Changes requested. Reviewed everything up to 70e620b in 1 minute and 16 seconds

More details

Looked at 734 lines of code in 7 files
Skipped 1 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_q2KEKG2CuxHXzKwa


hamilton/plugins/h_mlflow.py

hamilton/plugins/mlflow_extensions.py

skrawcz

The only big one is the `Any` typing. It would be nice to constrain it, but if that's too hard, I guess that's the trade-off.

Otherwise minor things.

skrawcz · 2024-06-11T01:17:40Z

examples/mlflow/README.md

+
+This pairs nicely with the `HamiltonTracker` and the [Hamilton UI](https://hamilton.dagworks.io/en/latest/hamilton-ui/ui/) which gives you execution observability.
+
+We're looking forward to better link Hamilton "projects" with MLFlow "experiments" and runs from both projects.


Suggested change

-We're looking forward to better link Hamilton "projects" with MLFlow "experiments" and runs from both projects.
+We're looking forward to the better linking of Hamilton "projects" with MLFlow "experiments" and runs from both projects.

examples/mlflow/README.md

hamilton/plugins/mlflow_extensions.py

elijahbenizzy

A few comments, overall looks good. Only thing I'm iffy about is a lot of non-transparent coupling between the materializer types and MLFlow. Can we add a docs section about this?

Also, can we add this to the docs? We have docstrings + everything. This should at least go in reference

elijahbenizzy · 2024-06-11T14:32:58Z

examples/mlflow/tutorial.ipynb

Error -- rerun? MlflowException: Invalid experiment name: Ellipsis. Expects a string.

Yeah, I added code snippets at the end for demonstration purposes. I turned them into markdown snippets
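
For context on the error above: in Python the literal `...` evaluates to the built-in `Ellipsis` object, so a demonstration snippet that uses `...` as a placeholder argument passes a non-string value at run time. The sketch below illustrates this with a hypothetical `validate_experiment_name` function. It is not MLFlow's validation code, only an imitation of the reported message.

```python
def validate_experiment_name(name: object) -> str:
    """Hypothetical check: accept only string experiment names."""
    if not isinstance(name, str):
        # repr(Ellipsis) is "Ellipsis", which matches the name shown in the error.
        raise TypeError(f"Invalid experiment name: {name!r}. Expects a string.")
    return name


# A placeholder written as `...` is a real object, not a syntax gap.
placeholder = ...
```

Calling `validate_experiment_name(placeholder)` raises, while any string passes through unchanged.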

elijahbenizzy · 2024-06-11T14:33:34Z

hamilton/plugins/h_mlflow.py

+        run_description: Optional[str] = None,
+        log_system_metrics: bool = False,
+    ):
+        """Configure the MLFlow client and experiment for the lifetime of the tracker


Show usage in docstring? This should go in docs.

elijahbenizzy · 2024-06-11T14:34:39Z

hamilton/plugins/h_mlflow.py

+            # special case for matplotlib and plotly
+            # log materialized figure. Allows great degree of control over rendering format
+            # and also save interactive plotly visualization as HTML
+            elif node_tags["hamilton.data_saver.sink"] in ["plt", "plotly"]:


Add a TODO: we'll want to handle more things (datasets, etc.) that can come from this tag. This will want to be a dict rather than an if/elif statement.
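
A rough sketch of the dict-based dispatch suggested here, assuming each sink value maps to one handler function. The tag key and the `"plt"` / `"plotly"` values come from the diff above. The `"mlflow"` entry, the handler names, and the `logged` list (a stand-in for calls to the MLFlow client) are hypothetical.

```python
from typing import Any, Callable, Dict, List

SINK_TAG = "hamilton.data_saver.sink"  # tag key taken from the diff above

# Stand-in for the MLFlow client: records what would have been logged.
logged: List[str] = []


def log_figure(node_name: str, result: Any) -> None:
    """Hypothetical handler for matplotlib and plotly figures."""
    logged.append(f"figure:{node_name}")


def log_model(node_name: str, result: Any) -> None:
    """Hypothetical handler for saved models."""
    logged.append(f"model:{node_name}")


# One entry per sink; supporting a new sink means adding a key, not a branch.
SINK_HANDLERS: Dict[str, Callable[[str, Any], None]] = {
    "plt": log_figure,
    "plotly": log_figure,
    "mlflow": log_model,  # assumed sink name, for illustration
}


def handle_saver_node(node_name: str, node_tags: Dict[str, str], result: Any) -> bool:
    """Dispatch on the sink tag; return False when no handler is registered."""
    handler = SINK_HANDLERS.get(node_tags.get(SINK_TAG, ""))
    if handler is None:
        return False
    handler(node_name, result)
    return True
```

Unknown sinks fall through without raising, which matches the behaviour of an if/elif chain with no final `else`.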

hamilton/plugins/h_mlflow.py

elijahbenizzy · 2024-06-11T14:36:57Z

hamilton/plugins/mlflow_extensions.py

+    model_name: Optional[str] = None
+    version: Optional[Union[str, int]] = None
+    version_alias: Optional[str] = None
+    flavor: Optional[str] = None


"flavor" isn't a great name. Can it be typed more strictly (e.g., a `Literal[...]`)?

If it's just MLFlow's name, then that makes sense.

This is the MLFlow terminology; I didn't reinvent the wheel here.

Each flavor is a library / backend, and each has multiple types. We shouldn't handle them explicitly; that should be delegated to MLFlow.
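
One way to read "delegated to MLFlow" is that the flavor string is resolved to a submodule at run time instead of being enumerated in a `Literal[...]`. The sketch below shows that pattern with a hypothetical `resolve_flavor` helper. It is an assumption about the approach, not the plugin's actual code, and the `package` parameter exists only so the example can run without MLFlow installed.

```python
import importlib
from types import ModuleType
from typing import Union


def resolve_flavor(flavor: Union[str, ModuleType], package: str = "mlflow") -> ModuleType:
    """Return the module that implements a flavor (hypothetical helper).

    Accepts either an already imported module or a submodule name.
    """
    if isinstance(flavor, ModuleType):
        return flavor
    # Unknown names raise ModuleNotFoundError, so the set of valid
    # flavors is defined by the package itself, not by this code.
    return importlib.import_module(f"{package}.{flavor}")
```

With MLFlow installed, `resolve_flavor("sklearn")` would import `mlflow.sklearn`, which exposes `log_model` and `load_model`. Any flavor that MLFlow adds later would work without a change on the Hamilton side, at the cost of weaker static typing.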

ellipsis-dev · 2024-06-11T15:02:10Z

Skipped PR review on f73d81e because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.


skrawcz

@zilto can you write a nice squash merge commit -- i.e. something to explain any choices/things you didn't add support for but could, etc.

zilto added 5 commits June 6, 2024 14:05

added MLFlow model materialziers with tests

c695813

added saving example and model registration

3676d55

added MLFlowTracker; updated tests; updated materializer API

e82c887

removed alias from Saver

be20b05

added README & tutorial; updated test requirements

70e620b

ellipsis-dev bot reviewed Jun 10, 2024

hamilton/plugins/h_mlflow.py (resolved)

hamilton/plugins/mlflow_extensions.py (resolved)

hamilton/plugins/mlflow_extensions.py (outdated, resolved)

zilto closed this Jun 11, 2024

zilto reopened this Jun 11, 2024

zilto marked this pull request as draft June 11, 2024 01:03

zilto marked this pull request as ready for review June 11, 2024 01:04

skrawcz reviewed Jun 11, 2024

zilto added 3 commits June 11, 2024 10:29

fix if condition

1a93b57

fixed README typo

047cc4e

updated README

9430836

elijahbenizzy reviewed Jun 11, 2024

zilto added 2 commits June 11, 2024 10:39

added a test for HamiltonTracker and DataSaver contract

f3a2e0e

updated tutorial notebook

f73d81e

added docs, added TODOs

6caf06e

added metadata to registered models

98da37a

zilto mentioned this pull request Jun 12, 2024

Adds Simple MLFlow materializer #625

Closed

7 tasks

skrawcz approved these changes Jun 12, 2024

zilto merged commit 881010e into main Jun 12, 2024
23 checks passed

zilto deleted the plugin/mlflow branch June 12, 2024 23:44

zilto mentioned this pull request Jun 12, 2024

🚧 Adds MLflow materializer #358

Closed

7 tasks
