feat: Add show_graph to display a GraphViz plot for expressions #19365
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19365      +/-   ##
==========================================
- Coverage   80.02%   79.84%   -0.19%
==========================================
  Files        1528     1536       +8
  Lines      209871   211405    +1534
  Branches     2419     2445      +26
==========================================
+ Hits       167954   168800     +846
- Misses      41366    42050     +684
- Partials      551      555       +4
One thing I notice in the example is that the binary
The order here matches what is done by tree_format. I don't see an option there for reordering.
Couldn't we traverse in an alternative order? I think it's important to show the order of operations.
So, in the TreeWalker that builds the output shown here (and the output for tree_format), the iteration is deliberately done backwards in the code...
So, I'm not really clear why the code does binary children backwards.
It seems reasonable to me to only un-reverse the order for the GraphViz output and keep the existing behaviour of
--- a/crates/polars-plan/src/plans/ir/tree_format.rs
+++ b/crates/polars-plan/src/plans/ir/tree_format.rs
@@ -902,12 +902,13 @@ impl fmt::Binary for TreeFmtVisitor {
                 if !cell.text.is_empty() {
                     // Add node
                     let node_label = &cell.text.join("\n");
-                    let node_desc = format!("n{i}{j} [label=\"{node_label}\"]");
+                    let node_desc = format!("n{i}{j} [label=\"{node_label}\",ordering=\"out\"]");
                     relations.push(node_desc);
                     // Add child edges
                     if i < tree_view.rows.len() - 1 {
-                        for child_col in cell.children_columns.iter() {
+                        // Iter in reversed order to undo the reversed child order when iterating expressions
+                        for child_col in cell.children_columns.iter().rev() {
                             let next_row = i + 1;
                             let edge = format!("n{i}{j} -- n{next_row}{child_col}");
                             relations.push(edge);
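To make the effect of the suggested change concrete, here is a small stand-alone Python sketch of the same loop. It is not the polars implementation: the `Cell` class, the `dot_statements` helper, and the sample labels are hypothetical stand-ins, and it assumes (as described in this thread) that the walker records a binary expression's children right-to-left. GraphViz's `ordering="out"` node attribute keeps a node's out-edges left-to-right in the order they are declared, so reversing the child list puts the left operand on the left.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical stand-in for one cell of the tree view grid."""

    text: list[str] = field(default_factory=list)
    children_columns: list[int] = field(default_factory=list)


def dot_statements(rows: list[list[Cell]]) -> list[str]:
    """Emit DOT node and edge statements, mirroring the loop in the diff."""
    relations = []
    for i, row in enumerate(rows):
        for j, cell in enumerate(row):
            if not cell.text:
                continue
            # Node statement; ordering="out" pins the order of the out-edges.
            label = "\n".join(cell.text)
            relations.append(f'n{i}{j} [label="{label}",ordering="out"]')
            if i < len(rows) - 1:
                # Reverse to undo the right-to-left child order of the walker.
                for child_col in reversed(cell.children_columns):
                    relations.append(f"n{i}{j} -- n{i + 1}{child_col}")
    return relations


# Made-up grid for `a + b`, assuming the right operand landed in column 0.
rows = [
    [Cell(["binary: +"], [0, 1])],
    [Cell(["col(b)"]), Cell(["col(a)"])],
]
for statement in dot_statements(rows):
    print(statement)
```

With the reversal, the edge to `col(a)` (column 1) is declared before the edge to `col(b)` (column 0), so the rendered graph reads in operand order even though the grid itself is unchanged.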
Sounds good. My only hesitation is over how many operators are reversed. But, looking through sample output, it seems that most of them are, so I think we can just do this in general rather than for specific operators. This will make our output different from tree_format, but I guess that can be fixed in a future PR if the polars team feels it is worthwhile.
Thanks @adamreeve, I've gone ahead and put in the changes as suggested, and this now gives cleaner output and more streamlined code. Graph below.
So that the iterator traverses left nodes first. The tree-walker should not be adapted.
py-polars/tests/unit/operations/namespaces/files/test_show_graph.txt
Small nit, then it's good to go. Thanks @corwinjoy
Thanks @ritchie46! I have gone ahead and made the change.
Rationale: We have found that the simple text output from polars.Expr.meta.tree_format can become difficult to read for large expressions. Therefore, we wanted to build on the logic in tree_format to also produce GraphViz output for more complex expressions. This PR follows the logic in polars.LazyFrame.show_graph to add polars.Expr.meta.show_graph.
Example: