Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter and flatten markers #4639

Merged
merged 5 commits into from
Jul 8, 2024
Merged

Filter and flatten markers #4639

merged 5 commits into from
Jul 8, 2024

Conversation

ibraheemdev
Copy link
Member

@ibraheemdev ibraheemdev commented Jun 28, 2024

Summary

More marker simplification:

  • Filters out redundant subtrees based on outer expressions, e.g. a and (a or b) simplifies to a.
  • Flattens nested trees internally, e.g. (a and b) and c

Resolves #4536.

pub(crate) fn normalize(tree: MarkerTree) -> Option<MarkerTree> {
pub(crate) fn normalize(mut tree: MarkerTree) -> Option<MarkerTree> {
filter_all(&mut tree);
normalize_all(tree)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there redundant expressions that we'd only notice after normalizing (e.g., flattening)? In other words, is filter_all(normalize_all(...)) idempotent? Or is there some risk that we don't normalize everything by way of this ordering?

Copy link
Member Author

@ibraheemdev ibraheemdev Jun 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter_all was written to handle non-flattened trees and can also leave the tree in a denormalized (not flattened) state.

Thinking about it more though, I suppose it is possible that there are duplicates that only show up after combining version ranges, so maybe we have to normalize before and after? I was trying to avoid this by writing filter_all to handle non-flattened ranges, but maybe normalizing before is unavoidable. I can try to create a test case where this happens.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed and added a test.

@charliermarsh
Copy link
Member

Relatedly, after this change, if we have a marker that contains an OR with MarkerTree::And(vec![]), do we map that to None (or even MarkerTree::And(vec![]))? It's relevant for a separate PR I'm working on.

}
}

// Collect all expressions from a tree.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scroll-by Nit: // -> /// so it's rustdoc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might add a little more rustdoc for cases like these. As it is, I think the comment is effectively strictly redundant with the name and type signature. I think I'd write something like this:

/// For a given slice of marker trees, this collects
/// all direct leaf expressions from each marker tree,
/// and skips any marker trees that are themselves
/// not leaves.

I might also add some context as to why this routine is useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more documentation.

@ibraheemdev
Copy link
Member Author

@charliermarsh it does not, does that ever come up?

@charliermarsh
Copy link
Member

@ibraheemdev -- I was considering making some changes elsewhere that would benefit from it, but it's not critical. Although I do think OR MarkerTree::And(vec![]) should be simplified to MarkerTree::And(vec![]).

What's left to do here?

@charliermarsh
Copy link
Member

Another one we could do: #4852

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this overall LGTM. This does seem like playing whack-a-mole, but I think we figured out it would require significant investment to do this fully correctly right?

}
}

// Collect all expressions from a tree.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might add a little more rustdoc for cases like these. As it is, I think the comment is effectively strictly redundant with the name and type signature. I think I'd write something like this:

/// For a given slice of marker trees, this collects
/// all direct leaf expressions from each marker tree,
/// and skips any marker trees that are themselves
/// not leaves.

I might also add some context as to why this routine is useful.

}
}

// Filters out matching expressions from any nested conjunctions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this doesn't also filter them out for a direct leaf expression? I guess it would need to rewrite the marker expression as MarkerTree::And(vec![]) in that case...

Another perspective that I find helpful for routines like this is phrasing like, "returns a marker tree in which the given expression is assumed to be true in all conjunctions." I think that captures better the essence of the contract here if I'm understanding correctly.

Copy link
Member Author

@ibraheemdev ibraheemdev Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's covered by the dedup calls in normalize_all. It's a bit trickier to do that here because we want to filter recursively, but only deduplicate from the outer conjunction or disjunction, depending on the context (we don't just want to remove the expression everywhere).

@@ -1,4 +1,4 @@
#![allow(clippy::enum_glob_use)]
#![allow(clippy::enum_glob_use, clippy::single_match_else)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh Clippy. (I agree with your stylistic choice here.)

@ibraheemdev ibraheemdev force-pushed the more-marker-simplification branch 3 times, most recently from cb06b14 to aee0891 Compare July 8, 2024 18:46
@ibraheemdev ibraheemdev enabled auto-merge (squash) July 8, 2024 18:55
@ibraheemdev ibraheemdev merged commit ed4234d into main Jul 8, 2024
49 checks passed
@ibraheemdev ibraheemdev deleted the more-marker-simplification branch July 8, 2024 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply Absorption Law when normalizing markers
4 participants