`Feature` Improved DAG visualization #512

zilto · 2023-11-02T00:11:15Z

I tried to improve the DAG visualization using the graphviz library. The visualization should be less cluttered and should visually represent the expressivity of the Hamilton DAG (inputs, config, functions, materializers, overrides, etc.). I also added a legend directly in the viz for better readability.

Changes

All the changes are made to the function create_graphviz_graph(). The algorithm follows the same general structure: build nodes, then build edges. Several utility functions were created to centralize the relevant definitions:

create node label (same for all nodes except input nodes)
get the node type (input, config, function)
get node modifiers (override, output, collect, expand, materializer)
create a legend based on the node types of the DAG
added arguments to the internal function create_graphviz_graph(); these should be wired into the other user-facing visualization functions
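
A minimal sketch of the "build nodes, then build edges" structure described above, written in plain Python that emits DOT text instead of calling graphviz. The `Node` class, the helper names, and the style values are hypothetical and do not reproduce the code in hamilton/graph.py.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a DAG node; not Hamilton's Node class."""

    name: str
    type_name: str
    kind: str = "function"  # "input", "config", or "function"
    dependencies: List[str] = field(default_factory=list)


def get_node_label(node: Node) -> str:
    # The PR uses a different label for input nodes; this sketch keeps a
    # single format for every node to stay short.
    return f"{node.name}: {node.type_name}"


def get_node_style(node: Node) -> Dict[str, str]:
    # Illustrative styles only; the actual shapes and colors may differ.
    styles = {
        "input": {"shape": "rectangle", "style": "dashed"},
        "config": {"shape": "note", "style": "filled"},
        "function": {"shape": "rectangle", "style": "rounded,filled"},
    }
    return styles[node.kind]


def to_dot(nodes: List[Node]) -> str:
    lines = ["digraph G {"]
    # Pass 1: build nodes.
    for node in nodes:
        attrs = {"label": get_node_label(node), **get_node_style(node)}
        attr_text = ", ".join(f'{key}="{value}"' for key, value in attrs.items())
        lines.append(f"  {node.name} [{attr_text}]")
    # Pass 2: build edges from each dependency to its dependent.
    for node in nodes:
        for dependency in node.dependencies:
            lines.append(f"  {dependency} -> {node.name}")
    lines.append("}")
    return "\n".join(lines)


dag = [
    Node("a", "int", kind="input"),
    Node("b", "int", dependencies=["a"]),
]
dot_text = to_dot(dag)
```

Keeping label and style lookup in small helpers means the two passes only assemble attributes, which is the centralization the list above describes.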

Removed:

the "Input" and "Override" prefixes are no longer displayed, to reduce space used
the base shape for functions is now a rectangle / rounded rectangle, which takes significantly less space than an ellipse

How I tested this

Manually tested it with several DAGs. I included DAGs that used a maximum number of features.

Notes

Currently doesn't implement the Parallel/Collect special edges (but both types can now be distinguished by their node style)
Modifiers are applied sequentially and could override each other. Currently this does not happen, because each modifier acts on a different property (shape, fillcolor, periphery color, style)
Graph, nodes, and edges have a style attribute, which is a comma-separated list of heterogeneous values. These values are hard to manage and are overridden as a whole when updating style
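
To illustrate the last two notes, here is a rough sketch of applying modifiers in sequence while merging the comma-separated style value instead of replacing it. Only the "output" fillcolor is taken from the diff; the other modifier styles, the base style, and the function names are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical modifier styles; only the "output" fillcolor appears in the PR diff.
MODIFIER_STYLES: Dict[str, Dict[str, str]] = {
    "output": {"fillcolor": "#FFC857"},
    "collect": {"peripheries": "2"},
    "materializer": {"shape": "cylinder"},
    "override": {"style": "dashed"},
}


def merge_style_attribute(current: str, extra: str) -> str:
    """Union two comma-separated style values, preserving order."""
    parts = [part for part in current.split(",") if part]
    for part in extra.split(","):
        if part and part not in parts:
            parts.append(part)
    return ",".join(parts)


def apply_modifiers(base: Dict[str, str], modifiers: List[str]) -> Dict[str, str]:
    style = dict(base)  # copy so the base style is left untouched
    for modifier in modifiers:
        for key, value in MODIFIER_STYLES[modifier].items():
            if key == "style" and key in style:
                # Merge rather than overwrite the whole comma-separated list.
                style[key] = merge_style_attribute(style[key], value)
            else:
                # For any other shared key, the later modifier wins.
                style[key] = value
    return style


# Hypothetical base style for a function node.
base_style = {"shape": "rectangle", "style": "rounded,filled", "fillcolor": "#b4d8e4"}
result = apply_modifiers(base_style, ["output", "override"])
```

With this approach "override" adds "dashed" to the existing "rounded,filled" value, whereas a plain dict update would drop the rounded and filled parts.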

Checklist

PR has an informative and human-readable title (this will be pulled into the release notes)
Changes are limited to a single goal (no scope creep)
Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
Any change in functionality is tested
New functions are documented (with a description, list of inputs, and expected output)
Placeholder code is flagged / future TODOs are captured in comments
Project documentation has been updated if adding/changing functionality.

sweep-ai · 2023-11-02T00:12:19Z

Apply Sweep Rules to your PR?

Apply: Leftover TODOs in the code should be handled.
Apply: All new business logic should have corresponding unit tests in the tests/ directory.
Apply: Any clearly inefficient or repeated code should be optimized or refactored.

elijahbenizzy

Looking good -- a few comments:

Let's add some comments in the code to explain what we're doing -- not always clear
Walrus operator is not supported in 3.7 (maybe __future__?). Happy to use it, but that makes this dependent on killing 3.7 (which is coming soon, just waiting on pyarrow support for 3.12).
Unit tests seem to be failing -- we can probably update them to test this or remove the ones that were too specific in the first place.
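
On the walrus point: assignment expressions (PEP 572) were added in Python 3.8 as new syntax, and to my knowledge no `__future__` import enables them on 3.7. A small sketch of the 3.7-compatible rewrite, using a hypothetical helper unrelated to the PR's code:

```python
import re
from typing import Optional


def first_number(text: str) -> Optional[int]:
    # Python 3.8+ form, shown as a comment so this file still parses on 3.7:
    #     if (match := re.search(r"\d+", text)) is not None:
    # Python 3.7-compatible form: assign first, then test.
    match = re.search(r"\d+", text)
    if match is not None:
        return int(match.group())
    return None
```

The rewrite costs one extra line per use, so keeping 3.7 support for now should not require structural changes.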

elijahbenizzy · 2023-11-02T01:52:04Z

hamilton/graph.py

+
+    def _get_function_modifier_style(modifier: str):
+        if modifier == "output":
+            modifier_style = dict(fillcolor="#FFC857")


This will overwrite the prior fillcolor in certain cases, no?

Yes, but the only type of node with a fillcolor is "function nodes" (the default).

I wonder how people design visualization software, but would it make sense to have a sort of lexicon of all the possible combinations for internal purposes?

Yeah, I think I'm likely overthinking it though? As in, we can see what feedback we get?
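
If a lexicon of combinations is ever wanted, one lightweight option is to record which Graphviz attributes each modifier sets and flag pairs that touch the same one. The mapping below is hypothetical; "collect" and "expand" are deliberately given the same attribute to show what a conflict looks like, even though the PR states that no modifiers currently collide.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical mapping from modifier name to the attributes it sets.
MODIFIER_ATTRIBUTES: Dict[str, Set[str]] = {
    "output": {"fillcolor"},
    "collect": {"peripheries"},
    "expand": {"peripheries"},  # intentional overlap, for demonstration only
    "materializer": {"shape"},
    "override": {"style"},
}


def find_conflicts(
    modifier_attributes: Dict[str, Set[str]]
) -> List[Tuple[str, str, Set[str]]]:
    """Return every pair of modifiers that sets at least one common attribute."""
    conflicts = []
    for left, right in combinations(sorted(modifier_attributes), 2):
        shared = modifier_attributes[left] & modifier_attributes[right]
        if shared:
            conflicts.append((left, right, shared))
    return conflicts


conflicts = find_conflicts(MODIFIER_ATTRIBUTES)
```

A check like this could run in a unit test, so that a new modifier reusing an attribute fails early rather than silently overwriting a style.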

elijahbenizzy · 2023-11-02T01:54:34Z

hamilton/graph.py

+            ),
+        )
+
+        sorted_types = [


Nit -- maybe put these in order of commonality? So scan order is useful?

I moved materializer downwards

It's a bespoke ordering, but I thought about having config and input first because they most often appear at the top of the graph near the legend. Then, all others are of function type with modifiers.
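
A sketch of how a bespoke legend order like this can be applied to whatever node types are present in a given DAG. The order list and the function name are assumptions and may not match the contents of sorted_types in the PR.

```python
from typing import Iterable, List

# Hypothetical ordering: config and input first, then function and its modifiers.
LEGEND_ORDER = [
    "config",
    "input",
    "function",
    "output",
    "override",
    "collect",
    "expand",
    "materializer",
]


def sort_legend_types(present: Iterable[str]) -> List[str]:
    rank = {name: index for index, name in enumerate(LEGEND_ORDER)}
    # Unknown types sort after all known ones, alphabetically among themselves.
    return sorted(present, key=lambda name: (rank.get(name, len(rank)), name))


sorted_types = sort_legend_types({"materializer", "function", "input", "config"})
```

Sorting by rank rather than hard-coding the displayed list keeps the legend limited to the types actually present in the DAG.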

zilto · 2023-11-02T17:39:21Z

tests/test_graph.py

@@ -501,8 +502,9 @@ def test_function_graph_has_cycles_false():
    assert fg.has_cycles(nodes, user_nodes) is False


-def test_function_graph_display():
+def test_function_graph_display(tmp_path: pathlib.Path):


This test was initially making two assertions:

is the content of the file valid?

is no file created when output_file_path = None is passed?

I split this into two tests: test_function_graph_display() and test_function_graph_display_no_dot_output()
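
A rough sketch of what the split looks like with pytest's tmp_path fixture. The `display` function is a hypothetical stand-in for the code under test, and the test bodies are simplified compared with tests/test_graph.py.

```python
import pathlib
from typing import List, Optional


def display(lines: List[str], output_file_path: Optional[pathlib.Path]) -> str:
    """Hypothetical stand-in: returns DOT text, writing it only if a path is given."""
    dot_text = "".join(lines)
    if output_file_path is not None:
        output_file_path.write_text(dot_text)
    return dot_text


def test_function_graph_display(tmp_path: pathlib.Path):
    # Assertion 1: the written file has the expected content.
    dot_file_path = tmp_path / "dag.dot"
    display(["digraph {\n", "}\n"], output_file_path=dot_file_path)
    assert dot_file_path.read_text() == "digraph {\n}\n"


def test_function_graph_display_no_dot_output(tmp_path: pathlib.Path):
    # Assertion 2: nothing is written when no output path is passed.
    display(["digraph {\n", "}\n"], output_file_path=None)
    assert list(tmp_path.iterdir()) == []
```

Because tmp_path gives each test its own empty directory, the "no file created" check does not depend on what other tests left behind.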

zilto · 2023-11-02T17:41:26Z

tests/test_graph.py

+    dot = dot_file_path.open("r").readlines()
+    dot_set = set(dot)
+
+    assert dot_set.issuperset(expected_set) and len(dot_set.difference(expected_set)) == 1


Because the lines of the DOT file have no stable order, and the rows in the input node's table can now be ordered differently too, I compare sets instead. The DOT lines in expected_set have the input node commented out. Therefore, the newly produced dot_set should be a superset of expected_set with exactly one extra line, the one for the input node.
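
The same check on toy data, as a sketch; the DOT lines here are invented for illustration and are not the ones used in the test.

```python
# Expected lines, with the input node line left out on purpose.
expected_set = {
    "digraph {\n",
    "\tb [label=b]\n",
    "\ta -> b\n",
    "}\n",
}

# Lines as they might be read back from the produced DOT file.
produced_lines = [
    "digraph {\n",
    '\ta [label="a: int"]\n',  # the input node line, absent from expected_set
    "\tb [label=b]\n",
    "\ta -> b\n",
    "}\n",
]

dot_set = set(produced_lines)
extra_lines = dot_set.difference(expected_set)
# Order-insensitive: every expected line is present, plus exactly one extra.
is_match = dot_set.issuperset(expected_set) and len(extra_lines) == 1
```

One limitation of this approach is that a set collapses duplicate lines, so a line emitted twice would go unnoticed.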

elijahbenizzy

LGTM! Let's 🚢 🇮🇹

improved DAG viz from graph.create_graphviz_graph()

c62db2b

elijahbenizzy reviewed Nov 2, 2023

zilto added 5 commits November 2, 2023 11:04

added docstrings, comments, and edge logic

d7e368d

added parameter to show/hide legend

400964b

added tests for show_legend, orient, and hide_inputs

0d755f0

added parameters to

92dc5e7

fixed failing tests for graph.py

34539f1

zilto commented Nov 2, 2023

zilto added 4 commits November 2, 2023 13:49

removed type hint to pass 3.7 and 3.8 tests

289676f

fixed Python 3.7 and 3.8 type hints in graph.py

4f05bec

yet another type annotation

3536518

yet yet another type annotation for 3.7

775dbf2

elijahbenizzy self-requested a review November 3, 2023 17:58

elijahbenizzy approved these changes Nov 3, 2023

zilto merged commit 1d9fc7d into main Nov 3, 2023

zilto deleted the example/better-viz branch November 3, 2023 18:13
