-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify minterm generation in Regex NonBacktracking engine #61232
Conversation
PartitionTree had a holdover from when it was tracking the predicates that made up each minterm, which caused it to use order n too much memory for some patterns. Nodes with the same predicate as the parent are no longer created. This also significantly simplified the algorithm.
Tagging subscribers to this area: @eerhardt, @dotnet/area-system-text-regularexpressions Issue DetailsPartitionTree had a holdover from when it was tracking the predicates that made up each minterm, which caused it to use order n too much memory for some patterns: nodes with the same predicate as the parent are no longer created. This also significantly simplified the algorithm. The effect is particularly evident in patterns with lots of non-overlapping predicates. For the pattern "qwertyuuiopasdfghjklzxcvbnm" this dropped the number of PartitionTree nodes created from 378 to 53.
|
@@ -41,25 +40,17 @@ public MintermGenerator(IBooleanAlgebra<TPredicate> algebra) | |||
/// <returns>all minterms of the given predicate sequence</returns> | |||
public List<TPredicate> GenerateMinterms(params TPredicate[] preds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as you're improving things here, does this need to be params
or could that be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, I don't think it has to be. I'll change it to match what this is called with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Less code and cheaper. Yay :)
Also remove the deduplication in there, since its now optional and the only call site already passes in a unique set of predicates.
PartitionTree had a holdover from when it was tracking the predicates that made up each minterm, which caused it to use order n too much memory for some patterns: nodes with the same predicate as the parent are no longer created. This also significantly simplified the algorithm.
The effect is particularly evident in patterns with lots of non-overlapping predicates. For the pattern "qwertyuuiopasdfghjklzxcvbnm" this dropped the number of PartitionTree nodes created from 378 to 53.